Ancestors

Toot

Written by Dobody on 2025-01-23 at 23:53

How would one theoretically use #scraping from different sites of #events organizers and generate an #icalendar file to easily get notified of events in their city or region for themselves or their community?

This is to avoid using #meta as a source that many rely on for lack of alternatives (that are actually invested).

#webscraping #quitmeta

=> More informations about this toot | More toots from dobody@mastodon.design

Descendants

Written by Mumonkan on 2025-01-24 at 00:01

@dobody very interested in this - mostly the data acquisition part. (imho, once you have the data, generating ways to deliver that -- such as rss feeds, ical, etc should be easy-ish). 📆

i run a @gancio instance as a calendar of local events -- which has many ways to export/follow. however, getting data IN is the hard part. (i just do it manually when i stumble on something. very difficult to get others to add to it at this point. 😭 )

=> More informations about this toot | More toots from Mumonkan@mastodon.online

Written by Dobody on 2025-01-24 at 00:37

@Mumonkan @gancio I just took a detour to see how it would be done and, depending on the site that hosts the event data, you can scrape inside specific classes.

But that means... sigh a different scraping config for every new event site added...
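A minimal sketch of what "a different scraping config for every site" could look like, using only Python's standard `html.parser`. The site names, class names, and the `SITE_CONFIGS` layout are all hypothetical, and the extractor assumes each field is a simple element without text after nested tags.

```python
from html.parser import HTMLParser

# Hypothetical per-site configs: which CSS class wraps each event field.
SITE_CONFIGS = {
    "example-venue": {"title": "event-title", "date": "event-date"},
    "example-club": {"title": "gig-name", "date": "gig-when"},
}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element carrying one of the wanted classes."""

    def __init__(self, class_to_field):
        super().__init__()
        self.class_to_field = class_to_field  # e.g. {"event-title": "title"}
        self.current_field = None
        self.found = {field: [] for field in class_to_field.values()}

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        for cls in classes:
            if cls in self.class_to_field:
                # Start a new text buffer for this field.
                self.current_field = self.class_to_field[cls]
                self.found[self.current_field].append("")

    def handle_data(self, data):
        if self.current_field is not None:
            self.found[self.current_field][-1] += data

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current field.
        self.current_field = None


def scrape_events(html, site):
    """Return a list of {"title", "date"} dicts for one configured site."""
    config = SITE_CONFIGS[site]
    parser = ClassTextExtractor({cls: field for field, cls in config.items()})
    parser.feed(html)
    titles = [t.strip() for t in parser.found["title"]]
    dates = [d.strip() for d in parser.found["date"]]
    return [{"title": t, "date": d} for t, d in zip(titles, dates)]
```

Adding a new event site then means adding one entry to the config table rather than writing a new scraper, as long as the site's markup fits this simple shape.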

Agreed, generating the #icalendar data structure will be the actually easy part
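To illustrate why this part is the easy one, here is a rough sketch that turns a list of event records into iCalendar text with the standard library only. The record keys (`uid`, `start`, `title`) and the `PRODID` string are assumptions made for the example; start times must be timezone-aware, and long-line folding from RFC 5545 is left out.

```python
from datetime import datetime, timezone


def escape_text(value):
    """Escape the characters RFC 5545 reserves in TEXT values."""
    return (value.replace("\\", "\\\\")
                 .replace(";", "\\;")
                 .replace(",", "\\,")
                 .replace("\n", "\\n"))


def fmt_utc(dt):
    """Format a timezone-aware datetime as an iCalendar UTC date-time."""
    return dt.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def build_ics(events, now=None):
    """Build a VCALENDAR string from dicts with uid, start, and title keys."""
    now = now or datetime.now(timezone.utc)
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//example//event-scraper//EN",  # hypothetical product id
    ]
    for ev in events:
        lines += [
            "BEGIN:VEVENT",
            f"UID:{ev['uid']}",
            f"DTSTAMP:{fmt_utc(now)}",
            f"DTSTART:{fmt_utc(ev['start'])}",
            f"SUMMARY:{escape_text(ev['title'])}",
            "END:VEVENT",
        ]
    lines.append("END:VCALENDAR")
    # iCalendar content lines are terminated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

Writing the returned string to a `.ics` file and serving it from a stable URL is enough for most calendar apps to subscribe to it.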

=> More informations about this toot | More toots from dobody@mastodon.design

Written by Mumonkan on 2025-01-24 at 02:25

@dobody facebook (which arguably has decently structured events) allows you to export your own events in ical format. i have a little script i wrote which cleans this up a bit, but it is fairly usable. however, i am not aware of a public url (e.g. all events for a given location) out of facebook.

=> More informations about this toot | More toots from Mumonkan@mastodon.online

Written by Gancio on 2025-01-24 at 13:25

@Mumonkan @dobody currently on gancio you can enter events from the website, or import from ics (only one event at a time), or use the API (https://gancio.org/dev/api#add-a-new-event), or follow someone from the fediverse (since the last version, in addition to supporting other instances of gancio, you can also import events from wordpress, probably also from mobilizon groups but not tried yet).

I would like to implement an automation to import from ics, probably would be better via a plugin?

=> More informations about this toot | More toots from gancio@mastodon.cisti.org

Written by Mumonkan on 2025-01-24 at 17:52

@gancio ics import would be pretty cool. especially if it is of the "subscribed" variety -- that is, not just a one-time grab of an ics url, but rather gancio would keep checking the url for new events.

there is a local bike group who uses google calendar. i would love to have gancio subscribe to that url, so that when they add a new event, it pops up on mine.
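The "subscribed" behaviour could be sketched roughly like this, outside of gancio: fetch the ics url on a schedule, remember which event UIDs were already seen, and only pass on the new ones. This is not gancio's API or plugin interface; the function names are made up for illustration, and the tiny parser ignores line folding and property parameters.

```python
from urllib.request import urlopen


def fetch(url):
    """Download the calendar text from a public ics url."""
    with urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8")


def parse_events(ics_text):
    """Very small VEVENT reader: returns {uid: {property: value}}."""
    events, current = {}, None
    for line in ics_text.splitlines():
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            if "UID" in current:
                events[current["UID"]] = current
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            # Drop parameters such as ";TZID=..." from the property name.
            current[name.split(";", 1)[0]] = value
    return events


def new_events(ics_text, seen_uids):
    """Return events whose UID was not seen before, and record them."""
    fresh = []
    for uid, ev in parse_events(ics_text).items():
        if uid not in seen_uids:
            seen_uids.add(uid)
            fresh.append(ev)
    return fresh
```

A scheduled job could call `new_events(fetch(url), seen)` every hour and post each returned event through whatever import path the calendar offers; `seen` would need to be stored between runs.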

(still tho the biggest problem is useless events people post to instagram as images. 😵 sighhhh)

=> More informations about this toot | More toots from Mumonkan@mastodon.online

Written by martlem on 2025-01-24 at 00:13

@dobody for each event page to be watched I would curl it, filter the calendar part in the dom using htmlq, and diff it with the previous snapshot. The lines the diff returns are then added to the icalendar
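The snapshot-and-diff step could look something like this in Python, with `difflib` standing in for the `diff` command. It assumes the calendar fragment has already been fetched and filtered (the curl and htmlq part); the function names and the snapshot path are hypothetical.

```python
import difflib
from pathlib import Path


def added_lines(previous, current):
    """Return the lines present in current but not in previous."""
    diff = difflib.ndiff(previous.splitlines(), current.splitlines())
    # ndiff prefixes added lines with "+ ".
    return [line[2:] for line in diff if line.startswith("+ ")]


def check_snapshot(fragment, snapshot_path):
    """Compare a fragment with the stored snapshot, then update the snapshot."""
    path = Path(snapshot_path)
    previous = path.read_text(encoding="utf-8") if path.exists() else ""
    path.write_text(fragment, encoding="utf-8")
    return added_lines(previous, fragment)
```

On the first run there is no snapshot yet, so every line counts as new; later runs only return what changed since the last check.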

=> More informations about this toot | More toots from clemtre@mastodon.social

Written by Dobody on 2025-01-24 at 00:42

@clemtre i'm thinking of straight-up scraping the page with a dedicated library, maybe making it uniform across any type of data file, and then filtering out the diffs. If i filter right after curl, some data might falsely appear to have changed (imagine html elements with IDs or classes that number their order of appearance - every time a new event is added, the older ones might change classes and appear as different)
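One way to sketch that idea: once every site is scraped into the same record shape, compare events by a fingerprint of their meaningful fields only, so markup details like positional ids never count as a change. The field names (`title`, `start`, `location`, `css_id`) are assumptions for the example, not a fixed schema.

```python
import hashlib

# Only these fields decide whether two records are the same event.
IDENTITY_FIELDS = ("title", "start", "location")


def fingerprint(event):
    """Hash the normalized identity fields, ignoring markup-related keys."""
    parts = []
    for field in IDENTITY_FIELDS:
        # Collapse whitespace and ignore letter case.
        parts.append(" ".join(str(event[field]).split()).casefold())
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def diff_events(previous, current):
    """Return the events in current that have no match in previous."""
    known = {fingerprint(e) for e in previous}
    return [e for e in current if fingerprint(e) not in known]
```

With this, an older event whose element id shifts from `event-1` to `event-2` after a new event is inserted keeps the same fingerprint and is not reported again.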

It's gonna be a lot of work anyway.

=> More informations about this toot | More toots from dobody@mastodon.design

Proxy Information
Original URL
gemini://mastogem.picasoft.net/thread/113880361523028994