Hi #mapstodon folks, quick question: do you have an easy way to get #OSM data into #parquet format?
I've been looking into #Python tools for handling #pbf files, and they have been getting better, but for fast filtering they just can't compete with parquet, so I'd like to move my #OpenStreetMap data to that format as well.
Unfortunately, all I've seen so far is https://github.com/igor-suhorukov/openstreetmap_h3 and I'm not a fan of Java...
If that's all there is, I'll use it, but if you know something else... 🙏
=> More informations about this toot | More toots from tfardet@scicomm.xyz
Just adding as a reply that I ended up using GDAL >= 3.8 with ogr2ogr -f parquet new.parquet old.osm.pbf <layer>
and it works great (thanks @jeremiahpslewis)
=> More informations about this toot | More toots from tfardet@scicomm.xyz
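For anyone who wants to script the ogr2ogr call quoted above, here is a minimal Python sketch that builds one command per layer. The layer names are the ones GDAL's OSM driver exposes; the helper names and the output naming scheme (<input stem>_<layer>.parquet) are made up for illustration, and it assumes a GDAL build that includes the Parquet driver.

```python
import shutil
import subprocess
from pathlib import Path

# Layers exposed by GDAL's OSM driver when reading a .osm.pbf file.
OSM_LAYERS = ("points", "lines", "multilinestrings", "multipolygons", "other_relations")


def build_ogr2ogr_command(pbf_path, layer, out_dir="."):
    """Return the ogr2ogr argument list that writes one OSM layer to Parquet."""
    if layer not in OSM_LAYERS:
        raise ValueError(f"unknown OSM layer: {layer!r}")
    pbf = Path(pbf_path)
    # Hypothetical naming scheme: "old.osm.pbf" + "lines" -> "old_lines.parquet"
    stem = pbf.name.split(".", 1)[0]
    out = Path(out_dir) / f"{stem}_{layer}.parquet"
    return ["ogr2ogr", "-f", "Parquet", str(out), str(pbf), layer]


def convert(pbf_path, layers=OSM_LAYERS, out_dir="."):
    """Run ogr2ogr once per layer; raises if ogr2ogr is missing or a call fails."""
    if shutil.which("ogr2ogr") is None:
        raise RuntimeError("ogr2ogr not found on PATH")
    for layer in layers:
        subprocess.run(build_ogr2ogr_command(pbf_path, layer, out_dir), check=True)
```

Calling convert("old.osm.pbf", layers=("lines", "multipolygons")) would then produce one parquet file per requested layer in the current directory.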
@tfardet I guess it depends how much coding you want to do. Google have protobuf libraries for a range of languages - I used C#, for example.
=> More informations about this toot | More toots from Winwaed@mastodon.online
@Winwaed
While I could have a look, I definitely do not have enough time at the moment, so I'm rather hoping that some equivalent of osmium, or a similar tool, exists that would also support writing to parquet.
=> More informations about this toot | More toots from tfardet@scicomm.xyz
@tfardet "I do not have enough time" - I know the feeling lol
=> More informations about this toot | More toots from Winwaed@mastodon.online
@Winwaed yeah, someone got my hopes up that, thanks to GDAL >= 3.8, I won't have to dig into this. Fingers crossed, I'll check next week.
=> More informations about this toot | More toots from tfardet@scicomm.xyz
@tfardet Hi, I use pyrosm, which lets you load PBF data into GeoDataFrames: https://pyrosm.readthedocs.io/en/latest/ GDFs can then be exported to parquet: https://geopandas.org/en/stable/docs/reference/api/geopandas.GeoDataFrame.to_parquet.html Hope it helps!
=> More informations about this toot | More toots from ZorzalErrante@datasci.social
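The pyrosm route suggested above could look roughly like the sketch below. It assumes pyrosm, geopandas and pyarrow are installed; the function name and the default getter are arbitrary choices for illustration. It reads the requested features into memory before writing, so it suits extracts that fit in RAM.

```python
# Getter methods that pyrosm's OSM class provides for common feature types.
PYROSM_GETTERS = (
    "get_network",
    "get_buildings",
    "get_pois",
    "get_landuse",
    "get_natural",
    "get_boundaries",
)


def pbf_to_parquet(pbf_path, out_path, getter="get_buildings"):
    """Load one feature type from a PBF with pyrosm and write it to Parquet.

    Returns the number of features written.
    """
    if getter not in PYROSM_GETTERS:
        raise ValueError(f"unsupported getter: {getter!r}")
    # Imported lazily so the argument check above works without pyrosm installed.
    from pyrosm import OSM

    osm = OSM(str(pbf_path))
    gdf = getattr(osm, getter)()  # a GeoDataFrame, or None if nothing matched
    if gdf is None or gdf.empty:
        raise RuntimeError(f"{getter} returned no features for {pbf_path}")
    gdf.to_parquet(out_path)  # GeoDataFrame.to_parquet needs pyarrow
    return len(gdf)
```

For example, pbf_to_parquet("old.osm.pbf", "buildings.parquet") would write the building footprints of the extract to a single file.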
@ZorzalErrante
That is definitely a possibility (though I moved to pyogrio instead), but it requires loading the whole file into memory, which I was hoping to avoid.
Will probably resort to that if there is no way around it and the Java tool proves problematic.
=> More informations about this toot | More toots from tfardet@scicomm.xyz