Ancestors

Written by Dan Goodman on 2025-01-23 at 17:08

What's the right way to think about modularity in the brain? This devilish 😈 question is a big part of my research now, and it started with this paper with @GabrielBena, finally published after the first preprint in 2021!

https://www.nature.com/articles/s41467-024-55188-9

We know the brain is physically structured into distinct areas ("modules"?). We also know that some of these have specialised function. But is there a necessary connection between these two statements? What is the relationship - if any - between 'structural' and 'functional' modularity?

TLDR if you don't want to read the rest: there is no necessary relationship between the two, although when resources are tight, functional modularity is more likely to arise when there's structural modularity. We also found that functional modularity can change over time! Longer version follows.

#Neuroscience #CompNeuro #ComputationalNeuroscience

Written by Dan Goodman on 2025-01-23 at 17:09

@GabrielBena

It could be the case that they're entirely separate: functional modules that don't overlap the structural modules at all. We often look for spatial maps in the brain, but the existence of salt-and-pepper maps shows that the brain doesn't have to organise spatially.

It could be somewhere in between, with functional modules partially overlapping structural modules, which would explain why we can observe partial functional deficits after lesions to some but not all areas.

Or maybe the brain isn't constrained to have anything that we would recognise as functional modularity at all? We won't get too deeply into that possibility, but we did ask what features we would expect based on our intuitions about modularity.

=> View attached media

Written by Dan Goodman on 2025-01-23 at 17:10

@GabrielBena

Some of these come from Fodor, and later Shallice and Cooper: a module should perform a distinct sub-function, respond to only one type of input, have limited access to information outside its own state, and impairing it shouldn't impair other modules. Can we quantify these? We came up with three measures of functional modularity based on:

(1) probing (can we infer information a module shouldn't have from its activity? a quick sketch follows below),

(2) ablation (which sub-functions are impaired),

(3) dependency on data that should be irrelevant to the module (measured with correlation).
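
To make the probing measure concrete, here's a minimal Python sketch of the idea (an illustration, not the code from the paper): fit a linear decoder on one module's hidden activity and ask how well it predicts the digit shown to the other module. The recordings below are random placeholders standing in for activity from a trained network, so the decoder should sit at chance.

```python
# Probing sketch: can module A's activity be used to decode module B's input?
# High decoding accuracy = module A carries information it "shouldn't" have,
# i.e. low functional specialisation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder data; in the real experiment these come from the trained network.
hidden_A = rng.normal(size=(2000, 64))     # module A hidden states (trials x units)
digits_B = rng.integers(0, 10, size=2000)  # digit shown to module B on each trial

X_train, X_test, y_train, y_test = train_test_split(
    hidden_A, digits_B, test_size=0.25, random_state=0)

probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probe accuracy:", probe.score(X_test, y_test))  # ~0.1 (chance) here
```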

We also designed a task and network architecture intended to have maximal, controllable structural modularity. There are two modules (dense recurrent neural networks) with sparse interconnections. Each receives a separate input. Solving the task requires them to share precisely one bit of information.
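
For intuition, here's a minimal PyTorch sketch of that kind of architecture (my own simplification, not the authors' implementation): two dense recurrent modules, each driven by its own one-hot digit, with a fixed sparse mask controlling how many cross-module connections exist.

```python
import torch
import torch.nn as nn

class TwoModuleRNN(nn.Module):
    def __init__(self, n_in=10, n_hidden=32, p_inter=0.05):
        super().__init__()
        self.n_hidden = n_hidden
        # Each module gets input weights for its own digit only (one-hot encoded).
        self.w_in = nn.Parameter(0.1 * torch.randn(2, n_hidden, n_in))
        # Recurrent weights: dense within each module, sparse between them.
        self.w_rec = nn.Parameter(0.05 * torch.randn(2 * n_hidden, 2 * n_hidden))
        mask = torch.ones(2 * n_hidden, 2 * n_hidden)
        # Keep only a fraction p_inter of the possible cross-module connections.
        mask[:n_hidden, n_hidden:] = (torch.rand(n_hidden, n_hidden) < p_inter).float()
        mask[n_hidden:, :n_hidden] = (torch.rand(n_hidden, n_hidden) < p_inter).float()
        self.register_buffer("mask", mask)  # fixed structure; only weights are trained
        self.readout = nn.Linear(2 * n_hidden, 10)

    def forward(self, x1, x2, n_steps=5):
        # x1, x2: one-hot digits for modules 1 and 2, shape (batch, n_in).
        h = torch.zeros(x1.shape[0], 2 * self.n_hidden)
        inp = torch.cat([x1 @ self.w_in[0].T, x2 @ self.w_in[1].T], dim=1)
        for _ in range(n_steps):
            h = torch.tanh(h @ (self.w_rec * self.mask).T + inp)
        return self.readout(h)  # logits over the 10 possible answer digits
```

Sweeping p_inter between 0 and 1 moves a network like this from maximally modular to fully connected.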

Roughly speaking, the task is that each module is given one digit to observe. If the parity of the two digits is the same (both even or both odd) then return the first digit, otherwise the second digit. You can solve this by having each module only communicate one parity bit to the other.
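
The task rule itself fits in a few lines; here it is as a tiny, hypothetical data generator (my naming, not the paper's):

```python
import numpy as np

def make_batch(n=8, seed=0):
    # Each trial: module 1 sees digit d1, module 2 sees digit d2.
    # Target is d1 if the digits share parity, otherwise d2.
    rng = np.random.default_rng(seed)
    d1 = rng.integers(0, 10, size=n)
    d2 = rng.integers(0, 10, size=n)
    same_parity = (d1 % 2) == (d2 % 2)
    target = np.where(same_parity, d1, d2)
    return d1, d2, target

for a, b, t in zip(*make_batch()):
    print(f"digits ({a}, {b}) -> answer {t}")
```

Since each module only sees its own digit, the minimum it has to tell the other is that digit's parity: exactly one bit.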

=> View attached media

Toot

Written by Dan Goodman on 2025-01-23 at 17:11

@GabrielBena

By varying the number of connections between the modules we can span the full range of structural modularity (measured with the widely used graph-theoretic Q metric), train on the task using backprop, and then measure how much each module specialises on its own inputs.
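
For reference, Q here is the standard Newman modularity of the connectivity graph. A toy illustration of how it behaves as cross-module edges are added (using networkx; this just illustrates the metric, not the paper's analysis code):

```python
import itertools
import random
import networkx as nx
from networkx.algorithms.community import modularity

random.seed(0)
n = 20  # units per module
module_1, module_2 = set(range(n)), set(range(n, 2 * n))

G = nx.Graph()
G.add_edges_from(itertools.combinations(module_1, 2))  # dense within module 1
G.add_edges_from(itertools.combinations(module_2, 2))  # dense within module 2

# More cross-module edges -> lower structural modularity Q.
for n_cross in (0, 20, 200):
    H = G.copy()
    H.add_edges_from((random.randrange(n), random.randrange(n, 2 * n))
                     for _ in range(n_cross))
    print(n_cross, "cross edges -> Q =",
          round(modularity(H, [module_1, module_2]), 3))
```

With no cross edges Q is 0.5 (its maximum for two equal-sized modules); adding them drives Q down towards 0, which is the axis being swept here.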

Good news! Firstly, all the measures of specialisation qualitatively agree, so we're measuring something real 🤞. When the two modules are fully connected to each other, we don't see any specialisation and when they're maximally modular we do. This is what we'd expect. 😅 All good then? Well...

What surprised us is how much structural modularity you need before you observe specialisation. You need Q>0.4, higher than you observe in the brain. So does this mean that structural and functional modularity are unrelated in practice? Not necessarily; there could be other mechanisms at play.

=> View attached media

Descendants

Written by Dan Goodman on 2025-01-23 at 17:12

@GabrielBena

Our intuition suggested that resource constraints are likely to be important: there's little incentive to specialise if you have infinite resources. Sure enough, when we did large parameter sweeps we saw that you get more specialisation when resources (neurons, synapses) are tight.

This seems like an important insight: we see that resource-constrained biological brains are great at generalisation (an expected outcome of having specialised modules with generalisable functions), while machine learning systems are not. Maybe we give them too much computational power? 🤯

=> View attached media

Written by Dan Goodman on 2025-01-23 at 17:12

@GabrielBena

Finally, we made use of the fact that our networks are recurrent (which was a necessary restriction of the simple architecture that we used) and checked how specialisation changed over time. Intriguingly, it decreases over time. But there's more.

This drop in specialisation happens faster when there are more synapses between the modules and when there is less noise. It looks as though specialisation may fall simply as a result of how much net communication bandwidth there is between the modules.

This raises the question: maybe specialisation isn't as simple as we think. Perhaps to some extent it's just a measurement artifact of limited communication bandwidth? Or maybe, understanding information flow is key to building systems that can specialise and generalise?

These are the sorts of questions we're following up on now. Hopefully we'll have more to say about that soon, but in the meantime we'd love to discuss the issues and questions raised here with you all.

What do you think?

Written by Andy Wootton on 2025-01-23 at 21:51

@neuralreckoning @GabrielBena Would speed of communication be more of a factor in some brain functions than others? In a high-speed parallel computer, you'd put the performance-critical components close together. Could evolution achieve the same?

Written by Dan Goodman on 2025-01-23 at 22:02

@woo @GabrielBena that's what we're wondering too! Maybe the brain has some high-speed, low-bandwidth lines of communication as well as high-bandwidth, low-speed ones, for different purposes.

Written by Andy Wootton on 2025-01-23 at 23:43

@neuralreckoning @GabrielBena Brains seem very good at optimising energy use. Why go fast, if going slow will get you there in time, with energy left over to think about other, more interesting things?

Written by Tim Hanson on 2025-01-23 at 20:17

@neuralreckoning @GabrielBena

Thanks for sharing, solid work. Some thoughts on this point:

"we see that resource constrained biological brains are great at generalisation, an expected outcome of having specialised modules with generalisable functions, while machine learning systems are not. Maybe we give them too much computational power?"

There seem to be approximately zero generalization penalties from overparameterization in networks trained via SGD. E.g. train a 4-layer MLP on MNIST: performance is identical if the two hidden layers are anywhere from 128 to 16k wide.
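
(For concreteness, a rough sketch of the comparison described above; the widths and layer count are one reading of the example, and the training loop is omitted.)

```python
import torch.nn as nn

def mlp(width):
    # Input layer, two hidden layers of the given width, and a 10-way output.
    return nn.Sequential(
        nn.Flatten(),
        nn.Linear(28 * 28, width), nn.ReLU(),
        nn.Linear(width, width), nn.ReLU(),
        nn.Linear(width, 10))

for width in (128, 16384):
    n_params = sum(p.numel() for p in mlp(width).parameters())
    print(f"hidden width {width}: {n_params:,} parameters")
    # Train each with SGD on MNIST (loop omitted); the claim above is that
    # test accuracy ends up essentially the same across this range of widths.
```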

Is it true that resource constraints engender better generalization in biological networks? (Trained, presumably, without SGD?). I see no plots or metrics of generalization - ?

Regarding functional specialization, imho how networks learn is an interplay between 'natural' spectral / eigenmode / Fourier-coefficient learning (axis-agnostic) and more interpretable axis-aligned learning, as driven by asymmetries in e.g. Adam or regularizers (or biology). Again, in practice forcing functional specialization tends to negatively impact ANN generalization performance; instead, it seems to be primarily a function of the network's structure, and not of the quantity of gross computation therein (above a limit).

The former which, of course, you've varied, in an interesting way... so where's the generalization figure? :-)

Written by Dan Goodman on 2025-01-23 at 20:21

@m8ta @GabrielBena interesting stuff, I'll have to think about it. We actually have it on the to-do list to look at how well all this generalises, because it's something we're thinking about more now than we were when we wrote the paper!
