Conversation
* makes one using the magic seed 0.
*/
void addNoise(Real fractionNoise);
void addNoise(Real fractionNoise); //TODO the name is confusing, rename to shuffle ?
Is it OK to rename this to shuffle() and use addNoise for the new function? @ctrl-z-9000-times
I looked at your new addNoise function and I think it will have issues with keeping the sparsity at a reasonable level. I think that the sparsity of an SDR after this method is called on it will always tend towards 50%.
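For what it's worth, a quick derivation of that drift (my own back-of-envelope reasoning, not from the thread): if a randomly chosen fraction p of the bits is flipped, the expected sparsity evolves as s' = s(1 - p) + (1 - s)p = s + p(1 - 2s). The only fixed point is s = 0.5, so repeatedly re-noising the same SDR does push its density towards 50%.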
That would indeed be wrong. What I intended:
- have the SDR of the current input
- flip 0.01% of its bits
- have a new SDR
- flip 0.01% of its bits
So the sparsity would remain roughly the same (actually it would grow slightly, because there are many more off bits, so flipping a bit on is more probable). It should remain around x% (2%) + 0.001%.
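Plugging rough numbers into the recurrence above (my arithmetic, not from the thread): starting from a fresh 2%-sparse SDR and flipping a fraction p of its bits once gives s' = 0.02 + p(1 - 0.04) ≈ 0.02 + p, i.e. the density grows by roughly the flip fraction. The drift towards 50% from the earlier comment only shows up if the same SDR keeps being re-noised instead of being rebuilt from the input each step.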
void SparseDistributedRepresentation::addNoise2(const Real probability, Random& rng) {
  NTA_ASSERT( probability >= 0.0f and probability <= 1.0f );
  const ElemSparse numFlip = static_cast<ElemSparse>(size * probability);
  if (numFlip == 0) return;
I'm trying to write an efficient implementation, but this has a problem when probability << 1/size: numFlip truncates to 0 and the call does nothing. Should we bother with such cases? return/assert?
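One possible way around that (a sketch of mine, not the PR's code): round numFlip probabilistically, so the expected number of flips stays at size * probability even when that product is below 1. To avoid guessing at the SDR/Random APIs, the snippet below is a standalone function over a dense 0/1 vector using std::mt19937; the rounding trick is the only point being made.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Sketch of probabilistic rounding for very small flip probabilities:
// flip floor(size*p) bits, plus one more with probability frac(size*p),
// so the expected number of flips equals size*p even when size*p < 1.
void addNoiseDense(std::vector<char> &dense, double probability, std::mt19937 &rng) {
  if (dense.empty()) return;
  const double expected = dense.size() * probability;   // may be < 1
  std::size_t numFlip = static_cast<std::size_t>(expected);
  std::uniform_real_distribution<double> unit(0.0, 1.0);
  if (unit(rng) < expected - numFlip)                    // round up with the fractional part
    ++numFlip;
  std::uniform_int_distribution<std::size_t> pick(0, dense.size() - 1);
  for (std::size_t i = 0; i < numFlip; ++i)
    dense[pick(rng)] ^= 1;   // flip a randomly chosen bit (duplicates possible; fine for a sketch)
}
```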
input.addNoise2(0.01f, rng_); //TODO apply at synapse level in Conn?
//TODO fix for probability << input.size
//TODO apply killCells to active output?
//TODO apply dropout to segments? (so all are: synapse, segment, cell/column)
Proof of concept: dropout applied to the input (as noise) and to the output (as killCells); a rough sketch of the output-side drop follows the list below.
- I'd prefer this be applied in Connections (in adaptSegment?)
- where to apply it?
- ideally in all of: SP, TM, & synapse, segment, cell, column
- but that would be computationally infeasible, so..?
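As a rough illustration of the output-side drop mentioned above (my own standalone sketch; I don't want to guess the repository's killCells signature): remove a random fraction of the active cell indices.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Sketch: output-side "dropout" as removing a random fraction of the
// active cells from a sparse SDR (a sorted list of active indices).
std::vector<unsigned> dropActive(std::vector<unsigned> active,
                                 double dropFraction, std::mt19937 &rng) {
  std::shuffle(active.begin(), active.end(), rng);   // randomize which cells survive
  const std::size_t keep =
      active.size() - static_cast<std::size_t>(active.size() * dropFraction);
  active.resize(keep);                               // discard the dropped tail
  std::sort(active.begin(), active.end());           // keep sparse indices sorted
  return active;
}
```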
Deterministic tests are still expected to fail, until we decide on values and update the exact outputs.
Maybe I don't understand this change, but it seems this will make the HTM perform worse. While it's interesting that the HTM keeps working even when some of its components are disabled, I don't think this belongs in the mainline. Maybe instead you could make an example/demonstration of these fault-tolerance properties (like Numenta did in their SP paper).
It's commonly used in deep learning, where it improves results a lot. To be exact, dropout helps prevent overfitting. While HTM is already more robust to that (sparse SDR for the output, stimulus threshold on the input segments), I want to see if this helps and by how much. I am looking for biological confirmation and for datasets to prove whether this works better. (It does slow things down a bit, but that is an implementation detail.)
Umm.. no components are disabled permanently; this temporarily flips bits, adding noise to the input.
The Hotgym example internally uses dropout.
WIP dropout implementation
EDIT:
Motivation: I believe this change can be considered biological (noise on the signal during transfer) and is also in line with deep learning (more robust representations). It should be supported by measurable SDR quality metrics (#155).