from:"Xavier Gervilla"

[window aggregate][debug] Rows not dropping with watermark and window

2022-04-27 Thread Xavier Gervilla

Hi team, With your help last week I was able to adapt a project I'm developing and apply a sentiment analysis and NER retrieval to streaming tweets. One of the next steps in order to ensure that memory doesn't collapse is applying windows and watermarks to discard tweets after some time. Howe

Re: [Spark Streaming] [Debug] Memory error when using NER model in Python

2022-04-20 Thread Xavier Gervilla

_df = df.select(explode_outer(map_keys(col(col_name.distinct() keys = list(map(lambda row: row[0], keys_df.collect())) key_cols = list(map(lambda f: col(col_name).getItem(f) .alias(str(col_name + sep + f)), keys)) drop_column_list = [col_nam

[Spark Streaming] [Debug] Memory error when using NER model in Python

2022-04-18 Thread Xavier Gervilla

Hi Team,https://stackoverflow.com/questions/71841814/is-there-a-way-to-prevent-excessive-ram-consumption-with-the-spark-configuration I'm developing a project that retrieves tweets on a 'host' app, streams them to Spark and with different operations with DataFrames obtains the Sentiment of th

[window aggregate][debug] Rows not dropping with watermark and window

Re: [Spark Streaming] [Debug] Memory error when using NER model in Python

[Spark Streaming] [Debug] Memory error when using NER model in Python

3 matches

Site Navigation

Mail list logo

Footer information