@bmoscon I can convert a dataframe to JSON documents and store them in a collection, but I don't know whether reads stay fast when there are thousands of documents. I would need to test it.
Hi all!
I am facing a problem and I need help from someone having experience in arctic.
I first tried antarctic to store Pandas dataframes, but it stores each dataframe in a single document, so I get errors whenever a dataframe exceeds MongoDB's 16 MB document size limit.
I think that arctic will solve my problem, but I don't know which store type fits my use case.
So, here is my use case:
Every 3 days I run a program that creates 3 Pandas dataframes for multiple projects.
So, each project has 3 dataframes.
In a second program a user selects one project and I want to have access to its 3 dataframes.
Which store should I use? Is this the correct usage?
from arctic import Arctic

db = Arctic('localhost')
db.initialize_library('projects')
projects_library = db['projects']
projects_library.write('project-1-dataframe-1', df1, metadata={'run_date': date1})
projects_library.write('project-1-dataframe-2', df2, metadata={'run_date': date1})
projects_library.write('project-1-dataframe-3', df3, metadata={'run_date': date1})
projects_library.write('project-2-dataframe-1', df4, metadata={'run_date': date2})
projects_library.write('project-2-dataframe-2', df5, metadata={'run_date': date2})
projects_library.write('project-2-dataframe-3', df6, metadata={'run_date': date2})
project1_df1 = projects_library.read('project-1-dataframe-1').data
Also, I don't think I will need access to data from previous runs. I only need the latest data for each project.
How can I do it optimally?
Thank you!
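For reference, the read/write pattern described above could be wrapped in two small helpers. This is only a sketch under some assumptions: `symbol_name`, `save_project` and `load_project` are hypothetical names, not part of arctic; `library` is assumed to be a library object such as `db['projects']` (as far as I know, `initialize_library` creates a version store by default); and `prune_previous_version=True` is passed on the assumption that older versions are not needed.

```python
FRAMES_PER_PROJECT = 3  # each project has 3 dataframes in this use case


def symbol_name(project, index):
    """Build a symbol such as 'project-1-dataframe-2' (naming scheme from the post)."""
    return f"{project}-dataframe-{index}"


def save_project(library, project, frames, run_date):
    """Write one run's dataframes for a project, one symbol per dataframe.

    `library` is assumed to expose write(symbol, data, metadata=...,
    prune_previous_version=...), as an arctic version store library does.
    """
    if len(frames) != FRAMES_PER_PROJECT:
        raise ValueError(f"expected {FRAMES_PER_PROJECT} frames, got {len(frames)}")
    for index, frame in enumerate(frames, start=1):
        library.write(
            symbol_name(project, index),
            frame,
            metadata={"run_date": run_date},
            # Assumption: only the latest run matters, so old versions may be pruned.
            prune_previous_version=True,
        )


def load_project(library, project):
    """Return the most recently written dataframes of a project, in order."""
    return [
        library.read(symbol_name(project, index)).data
        for index in range(1, FRAMES_PER_PROJECT + 1)
    ]
```

With a real library this would be called as `save_project(projects_library, 'project-1', [df1, df2, df3], date1)` in the first program and `load_project(projects_library, 'project-1')` in the second. Reading a symbol without specifying a version returns the latest one, which matches the "latest data only" requirement.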