Digifloat

Harnessing Elasticsearch with Python for Efficient Data Indexing and Searching

Introduction:

Elasticsearch, a distributed search and analytics engine, coupled with Python, offers a powerful solution for indexing and searching large volumes of data efficiently. In this comprehensive guide, we’ll explore how to leverage the Elasticsearch Python client to connect to both Elasticsearch Cloud and local instances, create indices, define custom mappings, ingest data from a CSV file, and perform queries. We’ll illustrate each step with practical examples, demonstrating how Elasticsearch can seamlessly integrate into Python applications.

 

Connecting to Elasticsearch:

The first step in working with Elasticsearch is establishing a connection. Using the Elasticsearch Python client, we can connect to both Elasticsearch Cloud and local instances effortlessly.

				
from elasticsearch import Elasticsearch

ENDPOINT = "http://localhost:9200"

# To connect with a username and password, use this instead
# (basic_auth is the 8.x client parameter; 7.x clients call it http_auth):
# es = Elasticsearch(hosts=[ENDPOINT], basic_auth=(USERNAME, PASSWORD))
es = Elasticsearch(hosts=[ENDPOINT])

# Check that Elasticsearch is reachable; ping() returns True on success
print(es.ping())
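
The snippet above covers a local instance. For Elasticsearch Cloud, the Python client also accepts a `cloud_id` and an `api_key` in place of `hosts`. The following is a minimal sketch, assuming an 8.x client; `build_client_kwargs` and `connect` are hypothetical helpers, not part of the library, and the credential values are placeholders.

```python
def build_client_kwargs(endpoint=None, cloud_id=None, api_key=None,
                        username=None, password=None):
    """Assemble keyword arguments for the Elasticsearch() constructor.

    Hypothetical helper: pass either a cloud_id (Elastic Cloud) or an
    endpoint URL (local or self-managed), plus optional credentials.
    """
    if bool(endpoint) == bool(cloud_id):
        raise ValueError("provide exactly one of endpoint or cloud_id")

    kwargs = {"cloud_id": cloud_id} if cloud_id else {"hosts": [endpoint]}

    if api_key:
        kwargs["api_key"] = api_key
    elif username and password:
        kwargs["basic_auth"] = (username, password)
    return kwargs


def connect(**options):
    """Create a client from the options accepted by build_client_kwargs."""
    # Imported here so the helper above can be tried without the package.
    from elasticsearch import Elasticsearch
    return Elasticsearch(**build_client_kwargs(**options))


# Local instance, no authentication
print(build_client_kwargs(endpoint="http://localhost:9200"))
# Elastic Cloud with an API key (both values are placeholders)
print(build_client_kwargs(cloud_id="<your-cloud-id>", api_key="<your-api-key>"))
```

Keeping the argument assembly separate from the client creation makes it easy to switch between local and cloud settings without touching the rest of the code.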
				
			

Creating an Index and Custom Mappings:

Indices are containers for storing documents in Elasticsearch, and mappings define the structure of these documents. Let’s create an index named “my_index” and define a custom mapping.

				
					#Index Schema

indexMapping={
    "properties":{
        "id":{
            "type":"long"
        },
        "Book":{
            "type":"keyword"
        },
        "Page_No":{
            "type":"long"
        },
        "Part":{
            "type":"text"
        },
        "Chapter":{
            "type":"text"
        },
        "Sub_Chapter":{
            "type":"text"
        },
        "Article_No":{
            "type":"keyword"
        },
        "Clause_No":{
            "type":"keyword"
        },
        "Text_vector":{
            "type":"dense_vector",
            "dims":768,
            "index":True,
            "similarity":"l2_norm"
        }        
    }
}
				
			
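Next, create the index and apply the custom mappings:

index_name = "my_index"
# Illustrative settings (an assumption); adjust shards and replicas for your cluster
indexSettings = {"number_of_shards": 1, "number_of_replicas": 0}
es.indices.create(index=index_name, settings=indexSettings, mappings=indexMapping)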
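
Calling `es.indices.create` a second time for an index that already exists raises an error, so it can help to check for the index first. A minimal sketch, assuming the `es` client and `indexMapping` defined above and the 8.x client API; `ensure_index` is a hypothetical helper name.

```python
def ensure_index(es, index_name, mappings, settings=None):
    """Create the index only if it does not exist yet.

    Returns True when the index was created, False when it was already there.
    `es` is expected to be an elasticsearch.Elasticsearch client (8.x API).
    """
    if es.indices.exists(index=index_name):
        return False
    create_args = {"index": index_name, "mappings": mappings}
    if settings is not None:
        create_args["settings"] = settings
    es.indices.create(**create_args)
    return True


# With a connected client:
# created = ensure_index(es, "my_index", indexMapping)
```

This makes the setup step safe to re-run, for example when the script is executed more than once during development.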
				
			
Ingesting Data from CSV:

Elasticsearch allows easy ingestion of structured data from various sources. Let’s demonstrate how to ingest data from a CSV file into our index.
				
import pandas as pd

df = pd.read_csv("constitution.csv", index_col=False)
# pandas names a header-less first column 'Unnamed: 0'; rename it to 'id'
df.rename(columns={'Unnamed: 0': 'id'}, inplace=True)

df.head()
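
One thing to watch for before indexing: empty CSV cells are read by pandas as `NaN`, and `NaN` is not a valid JSON value, so records containing it may be rejected when they are sent to Elasticsearch. The sketch below replaces `NaN` with `None` (serialized as JSON `null`) in each record. `clean_record` is a hypothetical helper, and the sample record is made up for illustration.

```python
import math


def clean_record(record):
    """Return a copy of the record with float NaN values replaced by None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in record.items()
    }


# Made-up record shaped like one row of df.to_dict("records")
sample = {"id": 0, "Book": "Anti Terrorism Act 1997", "Sub_Chapter": float("nan")}
print(clean_record(sample))
```

With a DataFrame, the helper could be applied as `[clean_record(r) for r in df.to_dict("records")]`.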
				
			
After reading the CSV into a pandas DataFrame, we convert it to a list of records and index each record into Elasticsearch:
				
record_list = df.to_dict("records")

# Push each record to the Elasticsearch index, one request per document
for record in record_list:
    try:
        es.index(index=index_name, document=record)
    except Exception as e:
        # Report the failure and continue with the next record
        print(e)
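
The loop above sends one HTTP request per document, which becomes slow for large files. The Python client ships a `helpers.bulk` function that batches documents into fewer requests. A minimal sketch, assuming each record carries an `id` field (as produced by the rename earlier) that is suitable as the document `_id`; `generate_actions` and `bulk_index` are hypothetical names, and the sample rows are made up.

```python
def generate_actions(records, index_name, id_field="id"):
    """Yield one bulk action per record, in the format helpers.bulk expects."""
    for record in records:
        action = {"_index": index_name, "_source": record}
        # Reuse the record's own id so re-running the import overwrites
        # existing documents instead of creating duplicates.
        if record.get(id_field) is not None:
            action["_id"] = record[id_field]
        yield action


def bulk_index(es, records, index_name):
    """Index all records with the bulk helper and return its summary."""
    # Imported here so generate_actions can be tried without the package.
    from elasticsearch import helpers
    # raise_on_error=False collects per-document errors instead of raising.
    success_count, errors = helpers.bulk(
        es, generate_actions(records, index_name), raise_on_error=False
    )
    return success_count, errors


# Made-up records to show the shape of the generated actions
rows = [{"id": 0, "Book": "Anti Terrorism Act 1997"}, {"Book": "No id here"}]
for action in generate_actions(rows, "my_index"):
    print(action)
```

With a connected client, `bulk_index(es, record_list, "my_index")` would return the number of successfully indexed documents and a list of any per-document errors.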
				
			
Querying Data:

With the data ingested, we can now perform various types of queries to retrieve relevant documents.
				
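book = "Anti Terrorism Act 1997"

# Execute the search: match documents whose Book field equals the given title
result = es.search(index=index_name, query={"match": {"Book": book}})

print(result)
# Print one field from each matching document
# (assumes the CSV provides a Textual_Metadata column)
for hit in result['hits']['hits']:
    print(hit['_source']['Textual_Metadata'])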
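
Queries can also combine several conditions. Because `Book` is mapped as `keyword` while `Part`, `Chapter`, and `Sub_Chapter` are mapped as `text`, one common pattern is a `bool` query that filters on the exact book title and runs a full-text `match` on a text field. The sketch below only builds the query dictionary; `build_book_query` and the search phrase are illustrative assumptions.

```python
def build_book_query(book, chapter_text=None):
    """Build a bool query: exact filter on Book, optional full-text match."""
    query = {"bool": {"filter": [{"term": {"Book": book}}]}}
    if chapter_text:
        # match analyzes the input, so it suits text fields such as Chapter
        query["bool"]["must"] = [{"match": {"Chapter": chapter_text}}]
    return query


query = build_book_query("Anti Terrorism Act 1997", chapter_text="special courts")
print(query)
# With a connected client: es.search(index="my_index", query=query)
```

Putting the exact-match condition under `filter` keeps it out of relevance scoring, so the ranking depends only on the full-text part.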
				
			

You can design other queries to suit your data and requirements.
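
The mapping defined earlier also includes a `Text_vector` field of type `dense_vector` with 768 dimensions, which can be searched by vector similarity. As a rough sketch, assuming a recent 8.x cluster and client (which accept a top-level `knn` option in search) and a query embedding produced by the same model as the stored vectors, the search clause could be built like this. `build_knn_clause` is a hypothetical helper, and the zero vector is only a placeholder.

```python
VECTOR_DIMS = 768  # must match "dims" in the Text_vector mapping


def build_knn_clause(query_vector, k=5, num_candidates=50):
    """Build the knn option for a search against the Text_vector field."""
    if len(query_vector) != VECTOR_DIMS:
        raise ValueError(
            f"expected {VECTOR_DIMS} dimensions, got {len(query_vector)}"
        )
    if num_candidates < k:
        raise ValueError("num_candidates must be at least k")
    return {
        "field": "Text_vector",
        "query_vector": list(query_vector),
        "k": k,
        "num_candidates": num_candidates,
    }


# Placeholder vector; a real one would come from an embedding model
knn = build_knn_clause([0.0] * VECTOR_DIMS, k=3)
print(knn["field"], knn["k"], knn["num_candidates"])
# With a connected client: es.search(index="my_index", knn=knn)
```

Checking the vector length up front surfaces a dimension mismatch before the request is sent.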

Conclusion:
In this blog post, we’ve explored the process of utilizing Elasticsearch with Python for efficient data indexing and searching. We’ve covered connecting to Elasticsearch instances, creating indices with custom mappings, ingesting data from CSV files, and executing queries to retrieve relevant information.

By harnessing the power of Elasticsearch alongside Python, developers can build robust search functionalities into their applications, enabling seamless integration with structured data sources like CSV files. Whether working with Elasticsearch Cloud or local instances, the Elasticsearch Python client provides a versatile and intuitive interface, empowering developers to leverage Elasticsearch’s capabilities effectively.

Rawaha Javed

Associate Consultant
