How would one theoretically use #scraping across the sites of different #events organizers to generate an #icalendar file, so that people can easily get notified of events in their city or region, for themselves or their community?
This is to avoid using #meta as a source that many rely on for lack of alternatives (that are actually invested).
#webscraping #quitmeta
=> More informations about this toot | More toots from dobody@mastodon.design
@dobody very interested in this - mostly the data acquisition part. (imho, once you have the data, generating ways to deliver that -- such as rss feeds, ical, etc should be easy-ish). 📆
i run a @gancio instance as a calendar of local events -- which has many ways to export/follow. however, getting data IN is the hard part. (i just do it manually when i stumble on something. very difficult to get others to add to it at this point. 😭 )
=> More informations about this toot | More toots from Mumonkan@mastodon.online
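To illustrate the "easy-ish" delivery side mentioned above, here is a minimal sketch that turns already-collected events into an iCalendar string using only the Python standard library. The dict keys (`uid`, `title`, `start`, `end`) and the `PRODID` value are assumptions made for this example, start and end times are assumed to already be in UTC, and long lines are not folded at 75 octets as RFC 5545 recommends.

```python
from datetime import datetime, timezone

def escape_text(value: str) -> str:
    """Escape the characters that RFC 5545 reserves in TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))

def to_ics(events, prodid="-//example//scraped-events//EN"):
    """Build a VCALENDAR string from dicts with uid, title, start, end.

    start and end are datetime objects assumed to be in UTC (an assumption
    of this sketch; real data would need time zone handling).
    """
    fmt = "%Y%m%dT%H%M%SZ"
    stamp = datetime.now(timezone.utc).strftime(fmt)
    lines = ["BEGIN:VCALENDAR", "VERSION:2.0", f"PRODID:{prodid}"]
    for ev in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{ev['uid']}",
            f"DTSTAMP:{stamp}",
            f"DTSTART:{ev['start'].strftime(fmt)}",
            f"DTEND:{ev['end'].strftime(fmt)}",
            f"SUMMARY:{escape_text(ev['title'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

The resulting string can be written to a `.ics` file or served over HTTP so that calendar apps can subscribe to it.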
@Mumonkan @gancio I just took a detour to see how it would be done and, depending on the site that hosts the event data, you can scrape the content of elements with specific classes.
But that means... sigh a different scraping config for every new event site added...
Agreed, generating the #icalendar data structure will be the actually easy part
=> More informations about this toot | More toots from dobody@mastodon.design
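The "different scraping config for every site" idea could look roughly like the sketch below, which uses Python's built-in `html.parser`. The site names and class names in `SITE_CONFIGS` are invented for illustration, and the parser assumes reasonably well-formed HTML (every non-void element has a closing tag); a dedicated scraping library would be more forgiving.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not affect depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}

# Hypothetical per-site configs: which class wraps one event, and which
# classes hold each field. One entry is needed per event site.
SITE_CONFIGS = {
    "venue-a.example": {"item": "event-card",
                        "fields": {"title": "event-title", "date": "event-date"}},
    "venue-b.example": {"item": "agenda-row",
                        "fields": {"title": "name", "date": "when"}},
}

class ClassScraper(HTMLParser):
    def __init__(self, config):
        super().__init__()
        self.config = config
        self.events = []
        self.item_depth = 0   # > 0 while inside an event element
        self.field = None     # name of the field being captured, if any
        self.field_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in VOID:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.item_depth:
            self.item_depth += 1
            if self.field:
                self.field_depth += 1
                return
            for name, cls in self.config["fields"].items():
                if cls in classes:
                    self.field, self.field_depth = name, 1
                    self.events[-1][name] = ""
                    break
        elif self.config["item"] in classes:
            self.item_depth = 1
            self.events.append({})

    def handle_endtag(self, tag):
        if tag in VOID or not self.item_depth:
            return
        self.item_depth -= 1
        if self.field:
            self.field_depth -= 1
            if self.field_depth == 0:
                self.events[-1][self.field] = self.events[-1][self.field].strip()
                self.field = None

    def handle_data(self, data):
        if self.field:
            self.events[-1][self.field] += data

def scrape(html, site):
    """Return a list of {field: text} dicts for one page of one site."""
    parser = ClassScraper(SITE_CONFIGS[site])
    parser.feed(html)
    parser.close()
    return parser.events
```

Adding a new event site then means adding one entry to `SITE_CONFIGS` rather than writing a new scraper, which is still per-site work but keeps it small.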
@dobody facebook (which arguably has decently structured events) allows you to export your own events in ical format. i have a little script i wrote which cleans this up a bit, but it is fairly usable. however, i am not aware of a public url (e.g. all events for a given location) out of facebook.
=> More informations about this toot | More toots from Mumonkan@mastodon.online
@Mumonkan @dobody currently on gancio you can enter events from the website, import them from ics (only one event at a time), add them through the API (https://gancio.org/dev/api#add-a-new-event), or follow someone from the fediverse (since the last version, in addition to supporting other instances of gancio, you can also import events from wordpress, and probably also from mobilizon groups, though I have not tried that yet).
I would like to implement an automation to import from ics, probably would be better via a plugin?
=> More informations about this toot | More toots from gancio@mastodon.cisti.org
@gancio ics import would be pretty cool. especially if it is of the "subscribed" variety -- that is, not just a one-time grab of an ics url, but rather gancio would keep checking the url for new events.
there is a local bike group who uses google calendar. i would love to have gancio subscribe to that url, so that when they add a new event, it pops up on mine.
(still tho the biggest problem is useless events people post to instagram as images. 😵 sighhhh)
=> More informations about this toot | More toots from Mumonkan@mastodon.online
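A "subscribed" import boils down to re-fetching the feed on a schedule and keeping only the events whose `UID` has not been seen before. The sketch below shows that idea in plain Python; it is not how gancio works or would implement it, the function names are made up for this example, and the parser is deliberately minimal (it assumes upper-case `BEGIN`/`END` lines and does not handle quoted parameter values containing colons).

```python
import urllib.request

def unfold(ics_text):
    """Undo RFC 5545 folding: a line starting with space or tab continues the previous one."""
    lines = []
    for raw in ics_text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines

def parse_events(ics_text):
    """Return {uid: {PROPERTY: value}} for each VEVENT in the feed."""
    events, current, nested = {}, None, 0
    for line in unfold(ics_text):
        if current is None:
            if line == "BEGIN:VEVENT":
                current = {}
        elif line.startswith("BEGIN:"):
            nested += 1            # e.g. a VALARM inside the event
        elif line.startswith("END:") and nested:
            nested -= 1
        elif line == "END:VEVENT":
            if "UID" in current:
                events[current["UID"]] = current
            current = None
        elif not nested and ":" in line:
            name, value = line.split(":", 1)
            # Drop parameters such as ;TZID=... from the property name.
            current[name.split(";", 1)[0].upper()] = value
    return events

def new_events(feed_text, seen_uids):
    """Events in the feed whose UID has not been imported yet."""
    return {uid: ev for uid, ev in parse_events(feed_text).items()
            if uid not in seen_uids}

def fetch(url):
    """Download the feed; the URL would be e.g. a public calendar's ics address."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

A cron job or timer could call `new_events(fetch(url), seen_uids)` periodically, import whatever comes back, and add those UIDs to the stored set.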
@dobody for each event page to be watched, I would curl it, filter the calendar part of the DOM using htmlq, and diff it against the previous snapshot. Whatever the diff returns gets added to the icalendar
=> More informations about this toot | More toots from clemtre@mastodon.social
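The snapshot-and-diff step of that pipeline can be sketched in Python with `difflib`, assuming the calendar fragment has already been fetched and extracted as text (the curl and htmlq steps are not reproduced here). The function names and the snapshot file layout are assumptions for this example.

```python
import difflib
from pathlib import Path

def added_lines(previous: str, current: str) -> list[str]:
    """Lines present in the current snapshot but not in the previous one."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff prefixes added lines with "+ "; strip that two-character marker.
    return [line[2:] for line in diff if line.startswith("+ ")]

def check_snapshot(path: Path, current: str) -> list[str]:
    """Compare against the stored snapshot, then overwrite it with the new one."""
    previous = path.read_text(encoding="utf-8") if path.exists() else ""
    path.write_text(current, encoding="utf-8")
    return added_lines(previous, current)
```

On the first run there is no previous snapshot, so every line is reported as new; later runs only return what was added since the last check.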
@clemtre i'm thinking of straight-up scraping the page with a dedicated library, maybe normalizing everything into one uniform data format, and then filtering out the diffs. If i diff right after curl, some data might falsely appear to have changed (imagine html elements with IDs or classes that number their order of appearance: every time a new event is added, the older ones might change classes and appear as different)
It's gonna be a lot of work anyway.
=> More informations about this toot | More toots from dobody@mastodon.design
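One way to avoid the false positives described above is to diff normalized records instead of raw HTML: keep only the meaningful fields, fingerprint them, and compare fingerprints between runs. This is a minimal sketch of that idea; the field names (`title`, `date`, `location`) and the `css_class` key in the usage example are assumptions, not something any particular site provides.

```python
import hashlib
import json

def normalize(record):
    """Keep only the meaningful fields, with whitespace collapsed.

    Markup details such as positional ids or classes are dropped on purpose,
    so renumbering elements does not make an event look new.
    """
    return {
        "title": " ".join(record.get("title", "").split()),
        "date": record.get("date", "").strip(),
        "location": " ".join(record.get("location", "").split()),
    }

def fingerprint(record):
    """Stable hash of the normalized record."""
    payload = json.dumps(normalize(record), sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def diff_events(old_records, new_records):
    """Normalized records that appear in new_records but not in old_records."""
    old = {fingerprint(r) for r in old_records}
    return [normalize(r) for r in new_records if fingerprint(r) not in old]
```

For example, if yesterday's scrape had `{"title": "Bike ride", "date": "2025-01-02", "css_class": "event-1"}` and today the same event shows up with `"css_class": "event-2"` next to a newly added one, `diff_events` returns only the new event. The fingerprint could also double as the `UID` of the generated calendar entry.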