Hello experts, I was wondering whether I could use the approach below to speed up data loading in Spark.
```python
from concurrent.futures import ThreadPoolExecutor, as_completed

from awsglue.context import GlueContext
from pyspark.context import SparkContext

glueContext = GlueContext(SparkContext.getOrCreate())

def extract_data_from_mongodb(mongo_config):
    df = glueContext.create_dynamic_frame.from_options(
        connection_type="mongodb",
        connection_options=mongo_config
    )
    return df

# Bounds defined before the config that references them.
lower_bound = 0
upper_bound = 200
segment_size = 10

mongo_config = {
    "connection.uri": "mongodb://url",
    "database": "",
    "collection": "",
    "username": "",
    "password": "",
    "partitionColumn": "_id",
    "lowerBound": str(lower_bound),
    "upperBound": str(upper_bound)
}

# Split the [lower_bound, upper_bound) range into fixed-size segments.
segments = [(i, min(i + segment_size, upper_bound))
            for i in range(lower_bound, upper_bound, segment_size)]

with ThreadPoolExecutor() as executor:
    futures = [executor.submit(execution, segment) for segment in segments]
    for future in as_completed(futures):
        try:
            future.result()
        except Exception as e:
            print(f"Error: {e}")
```

I am trying to use parallel threads to pull the data concurrently. So, is this approach effective?
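For context, `execution` is not defined in the snippet above. A minimal sketch of what I intend it to do, assuming each worker copies the shared config and narrows `lowerBound`/`upperBound` to its own segment (the helper name and this per-segment wiring are my own, not anything required by the Glue API):

```python
# Sketch (assumption): each thread copies the base config and reads one
# [seg_lower, seg_upper) slice of the partition column on its own.
def execution(segment):
    seg_lower, seg_upper = segment
    segment_config = dict(mongo_config)   # shallow copy of the shared config
    segment_config["lowerBound"] = str(seg_lower)
    segment_config["upperBound"] = str(seg_upper)
    return extract_data_from_mongodb(segment_config)
```

Each thread would then issue its own `from_options` read covering a 10-value slice of the `_id` range.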