SASKIA VOLA
Text Classification made easy with Elasticsearch. Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch. With a background in computational linguistics and several years of

SERVICES - SASKIA VOLA
Elasticsearch consulting. What can I do for you? Relevance tuning; data modeling; setting up your cluster; reviewing your cluster setup; performance tuning (read and write); queries and aggregations. Text mining consulting. What can I do for you? Information extraction; classification; clustering; matching; scraping. How I work: If you’re

ABOUT - SASKIA VOLA
After joining Elastic in 2018 as a Consulting Engineer, where I specialized in search consulting for slightly over a year, I decided to go back to freelancing. I’m very grateful for all that I have learned there and all the experience I gained while helping

PROJECTS - SASKIA VOLA
Imprint Extraction. The aim of this project was to extract addresses, phone numbers, e-mail addresses and names from German company websites. Imprints are a semi-structured source of information. It was necessary to create a framework that used dictionaries, regular expressions, a

SIMPLE METRICS FOR TEXTMINING
A simple metric for measuring the informativity of a given text is the ratio of content words to non-content words. Content words are nouns, proper nouns, verbs, and adjectives. Some definitions include adverbs and some prepositions, but a test showed that those were not useful. The content function ratio (CFR) is calculated like this:
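The formula itself is cut off in this excerpt, so what follows is not the author's definition. It is a minimal Python sketch of one plausible reading of the description above: the number of content words divided by the number of non-content words. The function name `content_function_ratio`, the use of Universal POS tag names, and the sample sentence are all assumptions made for illustration. The input is expected to be already POS-tagged.

```python
from typing import Iterable, Tuple

# Assumption: Universal POS tag names. The text lists the categories
# (nouns, proper nouns, verbs, adjectives) but does not name a tag set.
CONTENT_TAGS = {"NOUN", "PROPN", "VERB", "ADJ"}


def content_function_ratio(tagged: Iterable[Tuple[str, str]]) -> float:
    """Ratio of content words to non-content words in a tagged text.

    `tagged` is a sequence of (token, pos_tag) pairs. Every tag that is
    not in CONTENT_TAGS counts as non-content, punctuation included.
    """
    content = 0
    non_content = 0
    for _token, tag in tagged:
        if tag in CONTENT_TAGS:
            content += 1
        else:
            non_content += 1
    if non_content == 0:
        # Avoid division by zero: an empty text scores 0.0,
        # a text made up only of content words scores infinity.
        return float("inf") if content else 0.0
    return content / non_content


# Hypothetical, hand-tagged example sentence.
tagged_sentence = [
    ("Elasticsearch", "PROPN"),
    ("is", "AUX"),
    ("a", "DET"),
    ("search", "NOUN"),
    ("engine", "NOUN"),
]
cfr = content_function_ratio(tagged_sentence)  # 3 content / 2 non-content
```

Whether punctuation and auxiliaries should count as non-content words is a design choice the excerpt does not settle; filtering them out before calling the function changes the ratio.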

STATISTICAL AGGREGATIONS ON NUMERIC OBJECT ARRAY FIELDS
When working with statistical aggregations in ElasticSearch 1.7 I couldn’t find any documentation about how arrays are treated. Of course you need a numeric field for statistical aggregations. In my special case I needed arrays of objects. But this should obviously not make a difference.

NAMED ENTITY ANNOTATIONS IN ELASTICSEARCH
This blog post will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers, Alexander Reelsen:

WHEN SIMPLE IS BETTER: THE BOOLEAN SIMILARITY MODULE
The most basic and simple form of a language model for a search engine is the boolean model. When you build up an inverted index, you need to collect all words that make up this index. And then you need to store the so-called postings list, a list of all document ids where these words appear. For the boolean model you just store in a table

HOW TO BUILD A SELF-LEARNING SEARCH ENGINE WITH
This blog post will walk you through a demo that shows how you can use Elasticsearch to build a self-learning search engine. You can apply this technique if you have a user-facing UI and if you can access the web analytics that track the user interaction with your website.
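The inverted index and postings lists described in the boolean similarity teaser above can be illustrated with a small, self-contained Python sketch. This is not how Elasticsearch or Lucene implement it. The function names, the whitespace tokenizer, and the three sample documents are assumptions chosen to keep the example short. It shows only the boolean idea: a term either occurs in a document or it does not, with no frequencies or scores.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def build_inverted_index(docs: Dict[int, str]) -> Dict[str, List[int]]:
    """Map every term to its postings list: the sorted ids of the
    documents in which the term appears."""
    postings: Dict[str, Set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        # Assumption: lowercasing plus whitespace splitting stands in
        # for a real analyzer.
        for term in text.lower().split():
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


def boolean_and(index: Dict[str, List[int]], terms: Iterable[str]) -> List[int]:
    """Return the ids of documents that contain all of the given terms."""
    result = None
    for term in terms:
        ids = set(index.get(term.lower(), []))
        result = ids if result is None else result & ids
    return sorted(result or [])


# Hypothetical sample documents.
docs = {
    1: "elasticsearch is a search engine",
    2: "a boolean model for search",
    3: "text mining with elasticsearch",
}
index = build_inverted_index(docs)
matches = boolean_and(index, ["elasticsearch", "search"])
```

Intersecting postings lists answers an AND query; a union of the same lists would answer an OR query.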
Within the European research project ROBUST (Risk and Opportunity management of huge-scale BUSiness communiTy cooperation), which was conducted a few years ago by several universities and companies such as IBM and SAP, some interesting research has been carried out.

IMPRINT - SASKIA VOLA
Saskia Vola – Textmining Services
Fichtestraße 22
10967 Berlin
Germany
e-mail: info@saskia-vola.com
Tax-ID: 14/573/00432

ELASTICSEARCH WITH FACETED NAVIGATION IN 15
ElasticSearch with faceted navigation in 15 minutes. ElasticUI is an awesome and very easy to set up framework that enables faceted navigation for ElasticSearch, written in AngularJS.

FLASK FOR ELASTICSEARCH
I have built a nice search-engine template using Python Flask in the backend and providing faceted navigation in the frontend. The frontend was built using:

HOW TO SPEED UP INDEXING INTO ELASTICSEARCH
Exactly. It’s not very long. The more data you want to squeeze in before the index is refreshed and a new segment is written, the longer this refresh interval should be. For write-heavy use cases, 30 seconds or 60 seconds is a better refresh interval. If you want to know more about the anatomy of an indexing operation, have a look here.
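As a minimal sketch of the refresh interval advice above, the following Python snippet builds the JSON body for an index settings update that raises `index.refresh_interval`. The helper name `refresh_interval_settings` and the index name `my-index` in the comment are hypothetical. The snippet only constructs the request body and does not talk to a cluster.

```python
import json


def refresh_interval_settings(seconds: int) -> dict:
    """Build the body of an index settings update that sets the
    refresh interval to the given number of seconds."""
    if seconds <= 0:
        raise ValueError("seconds must be a positive integer")
    return {"index": {"refresh_interval": f"{seconds}s"}}


# 30 seconds, one of the values suggested for write-heavy use cases.
body = refresh_interval_settings(30)
payload = json.dumps(body)

# The payload would be sent with a request along the lines of
#   PUT /my-index/_settings
# where "my-index" is a placeholder index name.
print(payload)
```

A longer interval means newly indexed documents take longer to become searchable, so the value is a trade-off between indexing throughput and search freshness.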

INTRODUCING A GENERIC DYNAMIC MAPPING TEMPLATE FOR
Configuring a mapping for ElasticSearch is not required. Per definition and as opposed to Solr, ElasticSearch is schemaless. If not defined, a mapping for a type is created on the fly, based on the first document that is being indexed. If another document that is
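The author's own template is not included in this excerpt. As an illustration of what a dynamic mapping template looks like in general, here is a minimal Python sketch that builds a mapping body with a single `dynamic_templates` entry. It assumes the typeless mapping layout of Elasticsearch 7 and later; older versions, which the teaser's mention of a mapping "for a type" points to, nest the mapping under a type name instead. The helper name and the template name `strings_as_keywords` are hypothetical.

```python
def dynamic_template(name: str, match_mapping_type: str, field_type: str) -> dict:
    """Build one dynamic template: every new field whose detected JSON
    type equals `match_mapping_type` is mapped as `field_type`."""
    return {
        name: {
            "match_mapping_type": match_mapping_type,
            "mapping": {"type": field_type},
        }
    }


# Assumption: Elasticsearch 7+ layout, with no type level under "mappings".
index_body = {
    "mappings": {
        "dynamic_templates": [
            # Map every newly seen string field as an exact-match keyword
            # instead of relying on the default on-the-fly mapping.
            dynamic_template("strings_as_keywords", "string", "keyword"),
        ]
    }
}
```

Such a body would be supplied when the index is created, so that the on-the-fly mapping described above follows explicit rules rather than being inferred from the first indexed document.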

TEXTMINING ARCHIVES
Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so you might want to

INSTALL AND SECURE ELASTICSEARCH 1.X ON DIGITAL OCEAN
Elasticsearch on Digital Ocean is a great solution to get up and running quickly. Thanks for referencing my security post. As I mentioned, I learned those lessons the hard way.
Text Classification made easy with Elasticsearch. Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch.With a
SERVICES - SASKIA VOLA Elasticsearch consulting What can I do for you? Relevance tuning Datamodeling Setup your cluster Review your cluster setup Performance tuning (Read and Write) Queries and aggregations Textmining Consulting What can I do for you? Information Extraction Classification Clustering Matching Scraping How I work If you’re Read moreABOUT - SASKIA VOLA
After joining Elastic as a Consulting Engineer where I specialized in Search Consulting in 2018 for slightly over a year I decided to go back to freelancing. I’m very grateful for all that I have learned there and all the experience I gained while helping Read more PROJECTS - SASKIA VOLA Imprint Extraction The aim of this project was to extract addresses, phone numbers, e-mail addresses and names from German company websites. Imprints are a semi-sructured source of information. It was necessary to create a framework that used dictionaries, regularexpressions, a
TECHNOLOGY - SASKIA VOLA Type your search terms above and press return to see the searchresults.
SIMPLE METRICS FOR TEXTMINING A simple metric for measuring the informativity in a given text is the relative amount of content words to non-content words. Content words are nouns, proper nouns, verbs, and adjectives. Some definitions include adverbs and some prepositions, but a test showed that those were not useful. The content function ratio (CFR) is calculated likethis:
INTRODUCING A GENERIC DYNAMIC MAPPING TEMPLATE FOR Configuring a mapping for ElasticSearch is not required. Per definition and as opposed to Solr, ElasticSearch is schemaless. If not defined, a mapping for a type is created on the fly, based on the first document that is being indexed. If another document that is Readmore
NAMED ENTITY ANNOTATIONS IN ELASTICSEARCH This blogpost will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers Alexander Reelsen: STATISTICAL AGGREGATIONS ON NUMERIC OBJECT ARRAY FIELDS When working with statistical aggregations in ElasticSearch 1.7 I couldn’t find any documentation about how arrays are treated. Of course you need a numeric field for statistical aggregations. In my special case I needed arrays of objects. But this should obviously not make a difference. Read more HOW TO BUILD A SELF-LEARNING SEARCH ENGINE WITH This blogpost will walk you through a demo that shows how you can use Elasticsearch to build a self-learning search engine. You can apply this technique if you have a user facing UI and if you can access the webanalytics that tracks the user-interaction with your website.SASKIA VOLA
Text Classification made easy with Elasticsearch. Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch.With a
SERVICES - SASKIA VOLA Elasticsearch consulting What can I do for you? Relevance tuning Datamodeling Setup your cluster Review your cluster setup Performance tuning (Read and Write) Queries and aggregations Textmining Consulting What can I do for you? Information Extraction Classification Clustering Matching Scraping How I work If you’re Read moreABOUT - SASKIA VOLA
After joining Elastic as a Consulting Engineer where I specialized in Search Consulting in 2018 for slightly over a year I decided to go back to freelancing. I’m very grateful for all that I have learned there and all the experience I gained while helping Read more PROJECTS - SASKIA VOLA Imprint Extraction The aim of this project was to extract addresses, phone numbers, e-mail addresses and names from German company websites. Imprints are a semi-sructured source of information. It was necessary to create a framework that used dictionaries, regularexpressions, a
TECHNOLOGY - SASKIA VOLA Type your search terms above and press return to see the searchresults.
SIMPLE METRICS FOR TEXTMINING A simple metric for measuring the informativity in a given text is the relative amount of content words to non-content words. Content words are nouns, proper nouns, verbs, and adjectives. Some definitions include adverbs and some prepositions, but a test showed that those were not useful. The content function ratio (CFR) is calculated likethis:
INTRODUCING A GENERIC DYNAMIC MAPPING TEMPLATE FOR Configuring a mapping for ElasticSearch is not required. Per definition and as opposed to Solr, ElasticSearch is schemaless. If not defined, a mapping for a type is created on the fly, based on the first document that is being indexed. If another document that is Readmore
NAMED ENTITY ANNOTATIONS IN ELASTICSEARCH This blogpost will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers Alexander Reelsen: STATISTICAL AGGREGATIONS ON NUMERIC OBJECT ARRAY FIELDS When working with statistical aggregations in ElasticSearch 1.7 I couldn’t find any documentation about how arrays are treated. Of course you need a numeric field for statistical aggregations. In my special case I needed arrays of objects. But this should obviously not make a difference. Read more HOW TO BUILD A SELF-LEARNING SEARCH ENGINE WITH This blogpost will walk you through a demo that shows how you can use Elasticsearch to build a self-learning search engine. You can apply this technique if you have a user facing UI and if you can access the webanalytics that tracks the user-interaction with your website. SERVICES - SASKIA VOLA Elasticsearch consulting What can I do for you? Relevance tuning Datamodeling Setup your cluster Review your cluster setup Performance tuning (Read and Write) Queries and aggregations Textmining Consulting What can I do for you? Information Extraction Classification Clustering Matching Scraping How I work If you’re Read moreABOUT - SASKIA VOLA
After joining Elastic as a Consulting Engineer where I specialized in Search Consulting in 2018 for slightly over a year I decided to go back to freelancing. I’m very grateful for all that I have learned there and all the experience I gained while helping Read more CONTACT - SASKIA VOLA Type your search terms above and press return to see the searchresults.
SASKIA VOLA
Within the european research project ROBUST (Risk and Opportunity management of huge-scale BUSiness communiTy cooperation) which was conducted a few years ago by several universities and companies such as IBM and SAP, some interesting research has been conducted.TEXTMINING ARCHIVES
Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so youmight want to
ELASTICSEARCH WITH FACETED NAVIGATION IN 15 ElasticSearch with faceted navigation in 15 minutes. ElasticUI is an awesome and very easy to setup framework that enables faceted navigation for ElasticSearch, written in AngularJS. FLASK FOR ELASTICSEARCH I have built a nice search-engine template using Python Flask in the backend and providing faceted navigation in the frontend. The frontendwas built using:
HOW TO SPEED UP INDEXING INTO ELASTICSEARCH Exactly. It’s not very long. The more data you want to squeeze in before the index is refreshed and a new segment will be written the longer this refresh interval should be. For write-heavy usecases 30 seconds or 60 seconds is a better refresh interval. If you want to know more about the anatomy of a indexing operation, have a look here. TEXT CLASSIFICATION MADE EASY WITH ELASTICSEARCH Text Classification made easy with Elasticsearch. Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch. With a background in computational linguistics and several years of INSTALL AND SECURE ELASTICSEARCH 1.X ON DIGITAL OCEAN Elasticsearch on Digital Ocean is a great solution to get up and running quickly. Thanks for referencing my security post. As I mentioned I learned those lessons the hard way.SASKIA VOLA
Text Classification made easy with Elasticsearch. Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch.With a
SERVICES - SASKIA VOLA Elasticsearch consulting What can I do for you? Relevance tuning Datamodeling Setup your cluster Review your cluster setup Performance tuning (Read and Write) Queries and aggregations Textmining Consulting What can I do for you? Information Extraction Classification Clustering Matching Scraping How I work If you’re Read moreABOUT - SASKIA VOLA
After joining Elastic as a Consulting Engineer where I specialized in Search Consulting in 2018 for slightly over a year I decided to go back to freelancing. I’m very grateful for all that I have learned there and all the experience I gained while helping Read more PROJECTS - SASKIA VOLA Imprint Extraction The aim of this project was to extract addresses, phone numbers, e-mail addresses and names from German company websites. Imprints are a semi-sructured source of information. It was necessary to create a framework that used dictionaries, regularexpressions, a
TECHNOLOGY - SASKIA VOLA Type your search terms above and press return to see the searchresults.
SIMPLE METRICS FOR TEXTMINING A simple metric for measuring the informativity in a given text is the relative amount of content words to non-content words. Content words are nouns, proper nouns, verbs, and adjectives. Some definitions include adverbs and some prepositions, but a test showed that those were not useful. The content function ratio (CFR) is calculated likethis:
ELASTICSEARCH WITH FACETED NAVIGATION IN 15 ElasticSearch with faceted navigation in 15 minutes. ElasticUI is an awesome and very easy to setup framework that enables faceted navigation for ElasticSearch, written in AngularJS. WHEN SIMPLE IS BETTER: THE BOOLEAN SIMILARITY MODULE The most basic and simple form of a language model for a search engine is the boolean model. When you build up an inverted index, you need to collect all words that make up this index. And then you need to store the so called postings list, a list of all document ids where these words appear. For the boolean model you just store in a table FLASK FOR ELASTICSEARCH I have built a nice search-engine template using Python Flask in the backend and providing faceted navigation in the frontend. The frontendwas built using:
NAMED ENTITY ANNOTATIONS IN ELASTICSEARCH This blogpost will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers Alexander Reelsen:SASKIA VOLA
Text Classification made easy with Elasticsearch. Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch.With a
SERVICES - SASKIA VOLA Elasticsearch consulting What can I do for you? Relevance tuning Datamodeling Setup your cluster Review your cluster setup Performance tuning (Read and Write) Queries and aggregations Textmining Consulting What can I do for you? Information Extraction Classification Clustering Matching Scraping How I work If you’re Read moreABOUT - SASKIA VOLA
After joining Elastic as a Consulting Engineer where I specialized in Search Consulting in 2018 for slightly over a year I decided to go back to freelancing. I’m very grateful for all that I have learned there and all the experience I gained while helping Read more PROJECTS - SASKIA VOLA Imprint Extraction The aim of this project was to extract addresses, phone numbers, e-mail addresses and names from German company websites. Imprints are a semi-sructured source of information. It was necessary to create a framework that used dictionaries, regularexpressions, a
TECHNOLOGY - SASKIA VOLA Type your search terms above and press return to see the searchresults.
SIMPLE METRICS FOR TEXTMINING A simple metric for measuring the informativity in a given text is the relative amount of content words to non-content words. Content words are nouns, proper nouns, verbs, and adjectives. Some definitions include adverbs and some prepositions, but a test showed that those were not useful. The content function ratio (CFR) is calculated likethis:
ELASTICSEARCH WITH FACETED NAVIGATION IN 15 ElasticSearch with faceted navigation in 15 minutes. ElasticUI is an awesome and very easy to setup framework that enables faceted navigation for ElasticSearch, written in AngularJS. WHEN SIMPLE IS BETTER: THE BOOLEAN SIMILARITY MODULE The most basic and simple form of a language model for a search engine is the boolean model. When you build up an inverted index, you need to collect all words that make up this index. And then you need to store the so called postings list, a list of all document ids where these words appear. For the boolean model you just store in a table FLASK FOR ELASTICSEARCH I have built a nice search-engine template using Python Flask in the backend and providing faceted navigation in the frontend. The frontendwas built using:
NAMED ENTITY ANNOTATIONS IN ELASTICSEARCH
This blogpost will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers, Alexander Reelsen:
CONTACT - SASKIA VOLA
SASKIA VOLA
Within the European research project ROBUST (Risk and Opportunity management of huge-scale BUSiness communiTy cooperation), which was run a few years ago by several universities and companies such as IBM and SAP, some interesting research was conducted.
TEXT CLASSIFICATION MADE EASY WITH ELASTICSEARCH
Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch. With a background in computational linguistics and several years of…
WHY I DECIDED TO BECOME A FREELANCER
Back in early 2014 I had just quit my job at a startup because they had pivoted from semantic technologies to something that I couldn’t relate to anymore. So I applied for new jobs at larger companies. I… Read more
HOW TO SPEED UP INDEXING INTO ELASTICSEARCH
Exactly. It’s not very long. The more data you want to squeeze in before the index is refreshed and a new segment is written, the longer this refresh interval should be. For write-heavy use cases, 30 seconds or 60 seconds is a better refresh interval. If you want to know more about the anatomy of an indexing operation, have a look here.
NUTCH 2.2 WITH ELASTICSEARCH 1.X AND HBASE
This document describes how to install and run Nutch 2.2.1 with HBase 0.90.4 and ElasticSearch 1.1.1 on Ubuntu 14.04.
Prerequisites
Make sure you installed the Java-SDK 7.
$ sudo apt-get install openjdk-7-jdk
And you set JAVA_HOME in your .bashrc: Add the following… Read more
HOW TO BUILD A SELF-LEARNING SEARCH ENGINE WITH
This blogpost will walk you through a demo that shows how you can use Elasticsearch to build a self-learning search engine. You can apply this technique if you have a user-facing UI and if you can access the web analytics that track the user interaction with your website.
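For the refresh interval mentioned in the indexing teaser above, a minimal sketch of what raising it to 30 seconds could look like. It only builds the request for Elasticsearch's update index settings API (`PUT /<index>/_settings` with `index.refresh_interval`) and sends nothing. The index name `my-index` and the helper name `refresh_interval_request` are hypothetical.

```python
import json


def refresh_interval_request(index_name, seconds):
    """Build (method, path, body) for an index settings update; nothing is sent."""
    body = {"index": {"refresh_interval": f"{seconds}s"}}
    return "PUT", f"/{index_name}/_settings", json.dumps(body)


# "my-index" is a placeholder index name (assumption).
method, path, body = refresh_interval_request("my-index", 30)
print(method, path)
print(body)
```

The resulting method, path and body can be passed to whichever HTTP client or Elasticsearch client is already in use.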
TEST-DRIVEN RELEVANCE TUNING OF ELASTICSEARCH USING THE
This blog post is written for engineers who are always looking for ways to improve the result sets of their search application built on Elasticsearch. The goal of this post is to raise awareness of why you should care about relevance, what components are involved… Read more
SASKIA VOLA
TEXTMINING, NLP AND ELASTICSEARCH CONSULTING
2019-06-04
comment 0
HOW TO SPEED UP INDEXING INTO ELASTICSEARCH
There are in general 2 different scenarios when it comes to indexing. Either you have to deal with a stream of data, like logs, Twitter Stream, newsfeeds etc. or you have nightly database dumps. There might be cases where you have both nightly database dumps… Read more
Filed under: Elasticsearch
2019-04-23
comment 1
NAMED ENTITY ANNOTATIONS IN ELASTICSEARCH
This blogpost will show how you can use Elasticsearch to extract Named Entities and store them as annotations. There is a really nice plugin written by one of the main Elasticsearch developers, Alexander Reelsen: https://github.com/spinscale/elasticsearch-ingest-opennlp This plugin wraps the library OpenNLP and allows you to extract… Read more
Filed under: Elasticsearch, Textmining
2019-04-15
comment 0
WHEN SIMPLE IS BETTER: THE BOOLEAN SIMILARITY MODULE
I had a lecture about Information Retrieval at university. That’s the field that studies search engines. In the first few classes we learned about the history and evolution of language models that are used for search engines. The most basic and simple form of a… Read more
Filed under: Elasticsearch, Ranking
2019-04-01
comments 2
HOW TO BUILD A SELF-LEARNING SEARCH ENGINE WITH ELASTICSEARCH
This blogpost will walk you through a demo that shows how you can use Elasticsearch to build a self-learning search engine. You can apply this technique if you have a user-facing UI and if you can access the web analytics that track the user interaction with… Read more
Filed under: Elasticsearch, Ranking
2018-08-15
comment 0
TEST-DRIVEN RELEVANCE TUNING OF ELASTICSEARCH USING THE RANKING EVALUATION API
This blog post is written for engineers who are always looking for ways to improve the result sets of their search application built on Elasticsearch. The goal of this post is to raise awareness of why you should care about relevance, what components are involved… Read more
Filed under: Elasticsearch, Ranking
2017-05-24
comment 0
HOW TO USE ELASTICSEARCH FOR NATURAL LANGUAGE PROCESSING AND TEXTMINING — PART 2
Welcome to Part 2 of How to use Elasticsearch for Natural Language Processing and Text Mining. It’s been some time since Part 1, so you might want to brush up on the basics before getting started. This time we’ll focus on one very important type… Read more
Filed under: Elasticsearch, Textmining
2017-05-02
comment 0
TEXT CLASSIFICATION MADE EASY WITH ELASTICSEARCH
Elasticsearch is widely used as a search and analytics engine. Its capabilities as a text mining API are not as well known. In the following article I’d like to show how text classification can be done with Elasticsearch. With a background in computational linguistics and… Read more
Filed under: Elasticsearch, Textmining
2016-12-30
comment 0
HOW TO USE ELASTICSEARCH FOR NATURAL LANGUAGE PROCESSING AND TEXT MINING — PART 1
ElasticSearch is a search engine and an analytics platform. But it offers many features that are useful for standard Natural Language Processing and Text Mining tasks. Read more…
Filed under: Elasticsearch, Textmining
2016-12-02
comment 0
WHY I DECIDED TO BECOME A FREELANCER
Back in early 2014 I had just quit my job at a startup because they had pivoted from semantic technologies to something that I couldn’t relate to anymore. So I applied for new jobs at larger companies. I… Read more
Filed under: Uncategorized
2015-11-29
comment 0
STATISTICAL AGGREGATIONS ON NUMERIC OBJECT ARRAY FIELDS
When working with statistical aggregations in ElasticSearch 1.7 I couldn’t find any documentation about how arrays are treated. Of course you need a numeric field for statistical aggregations. In my special case I needed arrays of objects. But this should obviously not make a difference.… Read more
Filed under: Elasticsearch