
Creating labels is endless on big mbtiles #39


Closed
Ashot-KR opened this issue Feb 7, 2018 · 4 comments


Ashot-KR commented Feb 7, 2018

I tried to create labels for Russia:

{
    "country": "russia",
    "bounding_box": [36.72353152233891, 55.3729665317045, 36.74095515209966, 55.38046857198313],
    "zoom": 12,
    "classes": [
        { "name": "Buildings", "filter": ["has", "building"] }
    ],
    "imagery": "http://a.tiles.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token=TOKEN",
    "background_ratio": 1,
    "ml_type": "classification"
}

and US:

{
    "country": "united_states_of_america",
    "bounding_box": [-99.79180928271484,32.42732216399054,-99.67628117602538,32.51812432193046],
    "zoom": 12,
    "classes": [
        { "name": "Buildings", "filter": ["has", "building"] }
    ],
    "imagery": "http://a.tiles.mapbox.com/v4/mapbox.satellite/{z}/{x}/{y}.jpg?access_token=TOKEN",
    "background_ratio": 1,
    "ml_type": "classification"
}

When I run label-maker, the download process seems endless.
I left the process running all night for Russia; in the morning it was still active with no results (the GeoJSON file was 0 bytes).

Contributor

McCulloughRT commented Feb 21, 2018

What you're experiencing is probably caused by running out of RAM. I just ran 'label-maker label' on the united_states_of_america.mbtiles file on an AWS m5.4xlarge (64GB of RAM). It peaked at about 61GB of usage but did finally finish running in a half hour or so.

Looking at the code, it appears that stream-filter.py takes the output of tippecanoe-decode and, instead of streaming it, buffers the entire GeoJSON in memory before processing it.

A quick fix is to change line 8 of stream-filter.py to read:
for line in sys.stdin:
instead of:
for line in sys.stdin.readlines():

This will cause it to stream the lines instead of buffering them. I'm not sure if there are other implications of this given that you're relying on tippecanoe to output each feature on a single line... but it appears to make that assumption either way.
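To illustrate the difference described above, here is a minimal sketch of a line-by-line filter. It is not the actual stream-filter.py; the function name `filter_features`, its `key` parameter, and the sample input are hypothetical, and it assumes each feature arrives as a complete JSON object on its own line.

```python
import io
import json


def filter_features(stream, key="building"):
    """Yield features from line-delimited GeoJSON whose properties contain `key`.

    Iterating over `stream` directly pulls one line at a time, so memory use
    is bounded by the longest line. `stream.readlines()` would instead build
    a list of every line before the loop body runs once.
    """
    for line in stream:
        # Drop surrounding whitespace and a trailing comma, if any.
        line = line.strip().rstrip(",")
        if not line:
            continue
        try:
            feature = json.loads(line)
        except json.JSONDecodeError:
            # Assumption: any line that is not valid JSON on its own
            # is not a feature and can be skipped.
            continue
        if not isinstance(feature, dict):
            continue
        if key in (feature.get("properties") or {}):
            yield feature


# Small in-memory stand-in for piped input (hypothetical sample data).
sample = io.StringIO(
    '{"type": "Feature", "properties": {"building": "yes"}, "geometry": null}\n'
    '{"type": "Feature", "properties": {"highway": "primary"}, "geometry": null},\n'
    "not json\n"
    "\n"
)
kept = list(filter_features(sample))
```

In a script reading piped input, the same generator could be called as `filter_features(sys.stdin)`. Because it is a generator, nothing is read until the caller starts iterating, and only one line is held at a time.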

I'll submit a PR on it in the meantime. :)
#45

Contributor

drewbo commented Mar 1, 2018

@Ashot-KR can you try version 0.2.1 to see if this still occurs? I pushed the above fix from @McCulloughRT as well as a download fix (#35) for large files.

Author

Ashot-KR commented Mar 1, 2018

Thanks, I will try it in a few days, when I have some free time.

Author

Ashot-KR commented Mar 1, 2018

it took about 30 minutes, but it works, thanks!

Ashot-KR closed this as completed Mar 1, 2018

3 participants