Population Map example #175

Open
elGringo11 opened this issue Nov 6, 2023 · 9 comments

Comments

@elGringo11

Hello, I tried to run the Population Map example and got a bunch of errors:
PS C:\Users\XXX\Desktop\INSEE_API> & "C:/Program Files/Python311/python.exe" c:/Users/XXX/Desktop/INSEE_API/carte.py
API query number limit reached - function might be slowed down
Thanks for your help.

@hadrilec
Contributor

hadrilec commented Nov 6, 2023

Hello, thanks for your feedback. Could you please provide a reproducible example in this issue?

@elGringo11
Author

Hello,
Is this what you expect?
Thanks


from pynsee.utils.init_conn import init_conn
init_conn(insee_key="XXX", insee_secret="XXX")


from pynsee.geodata import get_geodata_list, get_geodata, GeoFrDataFrame


import math
import geopandas as gpd
import pandas as pd
from pandas.api.types import CategoricalDtype
import matplotlib.cm as cm
import matplotlib.pyplot as plt
import descartes

import warnings
from shapely.errors import ShapelyDeprecationWarning
warnings.filterwarnings("ignore", category=ShapelyDeprecationWarning)

import logging
import sys

logger = logging.getLogger()
logger.setLevel(logging.INFO)
formatter = logging.Formatter('[%(filename)s:%(lineno)s - %(funcName)20s() ] %(message)s')

file_handler = logging.FileHandler('mylogs.log')
file_handler.setLevel(logging.DEBUG)
file_handler.setFormatter(formatter)

logger.addHandler(file_handler)


# get geographical data list
geodata_list = get_geodata_list()
# get commune geographical limits
com = get_geodata('ADMINEXPRESS-COG-CARTO.LATEST:commune')

mapcom = gpd.GeoDataFrame(com).set_crs("EPSG:3857")

mapcom = mapcom.to_crs(epsg=3035)
mapcom["area"] = mapcom['geometry'].area / 10**6
mapcom = mapcom.to_crs(epsg=3857)

mapcom['REF_AREA'] = 'D' + mapcom['insee_dep']
mapcom['density'] = mapcom['population'] / mapcom['area']

mapcom = GeoFrDataFrame(mapcom)
mapcom = mapcom.translate(departement = ['971', '972', '974', '973', '976'],
                          factor = [1.5, 1.5, 1.5, 0.35, 1.5])

mapcom = mapcom.zoom(departement = ["75","92", "93", "91", "77", "78", "95", "94"],
                 factor=1.5, startAngle = math.pi * (1 - 3 * 1/9))
mapcom

mapplot = gpd.GeoDataFrame(mapcom)
mapplot.loc[mapplot.density < 40, 'range'] = "< 40"
mapplot.loc[mapplot.density >= 20000, 'range'] = "> 20 000"

density_ranges = [40, 80, 100, 120, 150, 200, 250, 400, 600, 1000, 2000, 5000, 10000, 20000]
list_ranges = []
list_ranges.append( "< 40")

for i in range(len(density_ranges)-1):
    min_range = density_ranges[i]
    max_range = density_ranges[i+1]
    range_string = "[{}, {}[".format(min_range, max_range)
    mapplot.loc[(mapplot.density >= min_range) & (mapplot.density < max_range), 'range'] = range_string
    list_ranges.append(range_string)

list_ranges.append("> 20 000")

mapplot['range'] = mapplot['range'].astype(CategoricalDtype(categories=list_ranges, ordered=True))

fig, ax = plt.subplots(1,1,figsize=[15,15])
mapplot.plot(column='range', cmap=cm.viridis,
             legend=True, ax=ax,
             legend_kwds={'bbox_to_anchor': (1.1, 0.8),
                          'title': 'density per km2'})
ax.set_axis_off()
ax.set(title='Distribution of population in France')
plt.show()

fig.savefig('pop_france.svg',
            format='svg', dpi=1200,
            bbox_inches = 'tight',
            pad_inches = 0)

@hadrilec
Contributor

hadrilec commented Nov 6, 2023

OK, thanks. I will have a look once I am back from holidays at the end of November.
Did you manage to get the map of France?
If the only warning you get is "API query number limit reached - function might be slowed down", that is fine; otherwise it might be a bug.

@elGringo11
Author

OK. I got a long list of errors. I picked this one; it may help you.
Have a nice holiday.

RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.  File "<string>", line 1, in <module>

  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 129, in _main
    prepare(preparation_data)
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "c:\Users\XXX\Desktop\INSEE_API\carte.py", line 38, in <module>
    com = get_geodata('ADMINEXPRESS-COG-CARTO.LATEST:commune')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\get_geodata.py", line 31, in get_geodata
    df = _get_geodata(id=id, update=update, crs=crs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\_get_geodata.py", line 173, in _get_geodata
    with multiprocessing.Pool(
         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 215, in __init__
    self._repopulate_pool()
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "C:\Program Files\Python311\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 158, in get_preparation_data
    _check_not_importing_main()
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 138, in _check_not_importing_main
    raise RuntimeError('''
RuntimeError:
        An attempt has been made to start a new process before the
        current process has finished its bootstrapping phase.

        This probably means that you are not using fork to start your
        child processes and you have forgotten to use the proper idiom
        in the main module:

            if __name__ == '__main__':
                freeze_support()
                ...

        The "freeze_support()" line can be omitted if the program
        is not going to be frozen to produce an executable.
API query number limit reached - function might be slowed down
API query number limit reached - function might be slowed down
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 120, in spawn_main
    exitcode = _main(fd, parent_sentinel)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 129, in _main
    prepare(preparation_data)
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 240, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "C:\Program Files\Python311\Lib\multiprocessing\spawn.py", line 291, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
                   ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen runpy>", line 291, in run_path
  File "<frozen runpy>", line 98, in _run_module_code
  File "<frozen runpy>", line 88, in _run_code
  File "c:\Users\XXX\Desktop\INSEE_API\carte.py", line 38, in <module>
    com = get_geodata('ADMINEXPRESS-COG-CARTO.LATEST:commune')
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\get_geodata.py", line 31, in get_geodata
    df = _get_geodata(id=id, update=update, crs=crs)
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\XXX\AppData\Roaming\Python\Python311\site-packages\pynsee\geodata\_get_geodata.py", line 173, in _get_geodata
    with multiprocessing.Pool(
         ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 119, in Pool
    return Pool(processes, initializer, initargs, maxtasksperchild,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 215, in __init__
    self._repopulate_pool()
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 306, in _repopulate_pool
    return self._repopulate_pool_static(self._ctx, self.Process,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\pool.py", line 329, in _repopulate_pool_static
    w.start()
  File "C:\Program Files\Python311\Lib\multiprocessing\process.py", line 121, in start
    self._popen = self._Popen(self)
                  ^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
    return Popen(process_obj)
           ^^^^^^^^^^^^^^^^^^
  File "C:\Program Files\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
    prep_data = spawn.get_preparation_data(process_obj._name)
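
The RuntimeError above names the usual remedy: on Windows, multiprocessing starts children with the "spawn" method, each child re-imports the main script, and any module-level code that starts a pool runs again inside the child. A minimal sketch of the idiom follows. It uses a toy `square` worker as a stand-in (an assumption for illustration, not pynsee code); in carte.py the equivalent change would presumably be to move the `get_geodata(...)` call and everything after it into a function called under the guard.

```python
import multiprocessing


def square(x):
    """Toy worker; stands in for whatever the pool actually computes."""
    return x * x


def main():
    # Everything that directly or indirectly starts worker processes
    # lives in this function, not at module level.
    with multiprocessing.Pool(processes=2) as pool:
        return pool.map(square, [1, 2, 3])


if __name__ == "__main__":
    # Under the "spawn" start method each child re-imports this file.
    # The guard is false in the child, so main() is not run again there.
    print(main())
```

Imports and function definitions can stay at module level; only the code that triggers process creation needs to sit behind the guard.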

@hadrilec
Contributor

Hi, in PR #162 I made commit 68d4c0e, which should act as a backup in case the multiprocessing used to retrieve geodata fails.
I hope we can merge the PR in the coming days, and that it will be the final fix for the issue you raised.
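
To illustrate what a backup for a failing pool can look like in general, here is a hedged sketch. It is not the code from commit 68d4c0e, whose contents are not shown in this thread; `map_with_fallback`, `double`, and the `pool_factory` parameter are hypothetical names introduced only for this example.

```python
import logging
import multiprocessing

logger = logging.getLogger(__name__)


def map_with_fallback(func, items, processes=2,
                      pool_factory=multiprocessing.Pool):
    """Apply func to items in a process pool, or serially if the pool fails.

    Hypothetical helper for illustration; not part of pynsee.
    """
    items = list(items)
    try:
        with pool_factory(processes) as pool:
            return pool.map(func, items)
    except Exception as exc:
        # Broad on purpose: any failure to create or use the pool
        # triggers the serial backup.
        logger.warning(
            "multiprocessing failed (%s); falling back to a serial loop", exc
        )
        return [func(item) for item in items]


def double(x):
    return 2 * x


if __name__ == "__main__":
    print(map_with_fallback(double, [1, 2, 3]))
```

Making the pool constructor a parameter is a design choice for this sketch: it lets the serial path be exercised without starting any processes.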

@elGringo11
Author

elGringo11 commented Mar 26, 2024 via email

@hadrilec
Contributor

Hello,
You can browse this app to find INSEE series easily: https://graffiti.lab.sspcloud.fr/
Click on the tab "graphique a la demande/plot yourself".
Otherwise, I guess it should be in the dataset COMPTES-ETAT.

@elGringo11
Author

elGringo11 commented Mar 26, 2024 via email

@hadrilec
Contributor

OK, maybe you can have a look at this: https://github.com/hadrilec/financial_market_report/blob/master/code/EU_gov_debt_interest.R
Otherwise, you could send an email to INSEE asking for the correct IDBANK series.


2 participants