Toots for johnmackintosh@fosstodon.org account

Written by John MacKintosh on 2025-01-14 at 21:19

Didn't think I had anything in common with Mike Portnoy, but this was my first time hearing one of her songs too

Mike Portnoy Hears Taylor Swift For The First Time

https://youtube.com/watch?v=cHl_gsd0OR0

You don't have to be into drums to appreciate this series, it's one of the best things on the internet.

His first "off-the-cuff" take is outstanding


Written by John MacKintosh on 2025-01-01 at 19:20

Current status: resisting the urge to install R on a recently liberated Raspberry Pi.

Going to use it for general Linux learning instead (should have added that to my latest blog post...)


Written by John MacKintosh on 2025-01-01 at 12:36

#rstats

Let the header image speak for itself:

https://johnmackintosh.net/blog/2025-01-01-once-more/


Written by John MacKintosh on 2024-12-23 at 00:58

While this was a bit of a faff (until I figured out the way forward), I will always take a bunch of text files containing all the data over some godforsaken table builder or anything to do with SPARQL.

"Here's some open data"

What format is it in?

"Terrible"

How do I access it?

"Register for API access. Mix one part of extract of dill with morning dew immediately following the equinox. Change your name to Keith. Make 29 selections on this shiny app. Defeat Medusa. Download as JSON. Cry a lot."


Written by John MacKintosh on 2024-12-22 at 21:18

Irregular untidy CSV files hold no fear for me

https://johnmackintosh.net/blog/rstats/2024-12-22-tidying-text-files/

(Early draft, liable to update if I can be bothered. Code on GitHub)

#RStats


Written by John MacKintosh on 2024-12-20 at 18:02

Think I've cracked it lads.

Edge cases have been addressed, and 70 tidy (and massive) .tsv files are now in place.

Next stop, duckdb and/or parquet


Written by John MacKintosh on 2024-12-20 at 08:04

#rstats

Looking for strategies for tidying multiple large CSV files, each of varying dimensions, which are held in a list.

They all start with 4 rows of useless text. Varying column widths.

The next several rows (could be one, could be four) are what should be column headers. No way of knowing how many there are without painstakingly going through each.

The last 6 rows are useless, and can be discarded.

I have a hacky solution, but I'm interested to hear how others would start to tackle this
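
One possible starting point, sketched in Python rather than R and not the hacky solution mentioned above. It assumes the first row with a numeric cell after the first column marks the start of the data; that heuristic, and the names `looks_numeric` and `tidy_rows`, are hypothetical. It would misfire on header rows that contain years or other numbers, so each file would still need a sanity check.

```python
import csv
import io


def looks_numeric(cell):
    """True if the cell parses as a number (thousands separators allowed)."""
    try:
        float(cell.replace(",", ""))
        return True
    except ValueError:
        return False


def tidy_rows(rows, skip_top=4, skip_bottom=6):
    """Drop junk rows, then split the rest into column names and data rows.

    Assumption: the first row with a numeric cell after the first column
    is the first data row; everything before it is a stacked header.
    """
    body = rows[skip_top:len(rows) - skip_bottom]
    first_data = next(
        (i for i, row in enumerate(body) if any(looks_numeric(c) for c in row[1:])),
        len(body),
    )
    width = max((len(r) for r in body), default=0)

    def pad(row):
        # Pad ragged rows so every row has the same number of cells.
        return list(row) + [""] * (width - len(row))

    header_rows = [pad(r) for r in body[:first_data]]
    # Collapse the stacked header rows into one name per column.
    names = [
        "_".join(part.strip() for part in col if part.strip())
        for col in zip(*header_rows)
    ]
    return names, [pad(r) for r in body[first_data:]]


# Made-up sample: 4 junk rows, 2 header rows, 2 data rows, 6 junk rows.
sample = (
    "Title\nSource\nNotes\nMore notes\n"
    ",Count,Count\nArea,Male,Female\n"
    "North,1200,1350\nSouth,980,1010\n"
    "f1\nf2\nf3\nf4\nf5\nf6\n"
)
names, data = tidy_rows(list(csv.reader(io.StringIO(sample))))
print(names)  # ['Area', 'Count_Male', 'Count_Female']
print(data)   # [['North', '1200', '1350'], ['South', '980', '1010']]
```

Applied to each file in the list, this gives a per-file header guess that can be eyeballed before the data is written out.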


Written by John MacKintosh on 2024-12-20 at 07:47

#rstats

Recommend me your go-to resources for working with (deeply) nested lists / list columns and purrr, that aren't R4DS or Advanced R.

Thank you


Written by John MacKintosh on 2024-12-02 at 17:23

#rstats

Late, but #AdventOfCode day 1, with #rdatatable

=> View attached media


Written by John MacKintosh on 2024-11-22 at 00:04

#rstats

https://github.com/johnmackintosh/cusumcharter

It's taken 3 years, but {cusumcharter} is on the brink of 10K CRAN downloads.

I know I've used it, and one other person got in touch about it when it first hit CRAN, but it's been radio silence otherwise.

I developed it really quickly - it was on CRAN within a week of getting the idea.

It also passed CRAN checks first time.

For that alone, it has a special place in my heart.

I may get round to tidying it up and doing a new release, no promises though.


Written by John MacKintosh on 2024-11-04 at 20:55

#rstats

duckdb, duckplyr, data.table and purrr is one heck of a combination, just so you know


Written by John MacKintosh on 2024-11-04 at 17:30

#rstats Misspelled "comorbidity" as "combordity" and was wrongly annoyed at my rubbish purrr skills

Fixed the typo and everything worked as would be expected if it was written by someone vaguely competent

=> View attached media


Written by John MacKintosh on 2024-10-27 at 14:24

I am thinking of creating a package of helper functions for #rdatatable that makes things like DT[, .N, by][order(-N)] and other common actions a lot easier, and which also works with the new programming interface.

That particular code for descending sort is something I've written hundreds of times, and I'm fed up of it. Wrapping it into a simple function has been a real boon in my latest project.

If anyone thinks this package may be a good idea, let me know (somehow) on this post #RStats


Written by John MacKintosh on 2024-10-24 at 06:57

#RStats

Was wondering why my notifications were going nuts over on the corporate BS site:

Yan Holtz's data-to-viz site is a great resource, and I'm always pleased to see the reaction this plot gets.

https://www.linkedin.com/posts/yan-holtz-2477534a_dataviz-activity-7254853100564328449-BP9z

Incidentally, a "hot-take" in the replies was that this could still have been a line chart, but I cannot see, for this level of detail, how on earth that could have worked?


Written by John MacKintosh on 2024-10-17 at 15:50

#rstats Absolutely love the fact that a colleague can mention a statistical technique / public health method I've never heard of, but I can Google it + "r package" and get several results.

Even better when one of them is powered by rdatatable and {checkmate}, was updated very recently, and is really straightforward to use with a {pkgdown} website


Written by John MacKintosh on 2024-10-09 at 16:05

#rstats

Spent some time working with {duckplyr} and {arrow} to save population time series data as parquet files.

Wrote a generalised importing function so I can filter out what I need.

So if I only need 48K rows out of 390K, I now only load the rows I need, and the rest can stay untouched.

I'm pretty much sold on this file format already.


Written by John MacKintosh on 2024-09-20 at 20:34

Looks pretty good, a tidy #rstats "solver"

I love stuff like this, but never seem to get the time to figure out how to apply them to real life NHS problems.

https://github.com/colin-fraser/tidyLP


Written by John MacKintosh on 2024-09-13 at 13:09

#rstats

Footering about with duckdb.

Discovered I can wrap setDT() around the whole tbl(con,data) |> wrangling thing, collect it and do some #rdatatable goodness, using some custom helper functions for some very common actions

It does mean a mix of old pipe and new pipe but I don't care about that so much.

Wish I'd experimented with this ages ago.

Edit: it's faster to wrap it in setDT() too, rather than collect() then setDT() on the result. Didn't expect that TBH


Written by John MacKintosh on 2024-09-12 at 07:57

LinkedIn:

"🎵You've got an🎵 unread notification🎵"

=> View attached media


Written by John MacKintosh on 2024-09-08 at 21:43

#RStats

Flashback to the time I wrote a function called "super-hands" - "for when your data is a bit more-ish"

(This only makes sense if you've seen Peep Show, and even then... it's pushing things a bit).

I did give it a more sensible name later on, honest

=> View attached media



Proxy Information

Original URL: gemini://mastogem.picasoft.net/profile/329375