author Phil Burton <phil@d3r.com> 2019-02-25 13:37:59 +0000
committer Phil Burton <phil@d3r.com> 2019-02-25 13:37:59 +0000
commit 3431e667a5c6475043ebfd97b43a3fdc4b078596
tree cd9eb1249e42de8ee1c7e99fd83cb7f091637b7c
parent 4e8368f4d847e5c1352302fc53658dfab2c72a9b

Refactor and clean up notes (HEAD, master)
Diffstat (limited to 'day1/supervised-learning.txt')
-rw-r--r-- day1/supervised-learning.txt | 77
1 files changed, 77 insertions, 0 deletions
diff --git a/day1/supervised-learning.txt b/day1/supervised-learning.txt
new file mode 100644
index 0000000..1dce7da
--- /dev/null
+++ b/day1/supervised-learning.txt
@@ -0,0 +1,77 @@
+# Learning: the hows and whys of machine learning
+
+Liam Wiltshire
+https://liam-wiltshire.github.io/talks/?talk=machinelearning&conference=phpuk
+https://joind.in/event/php-uk-conference-2019/learning-the-hows-and-whys-of-machine-learning
+
+## Overview
+
+Use case: chargebacks
+
+## Supervised learning
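+Training data (labelled examples)
+Learning functions (map inputs to outputs)
+Categorisation / Classification - which category does an item belong to
+Regression - where do we sit on a line (predict a continuous value)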
+
+## Naive Bayes classifier
+Standardise words before counting them
+- Un-pluralise
+- Un-gender
+- Un-tense
+- etc
+
+More data == better
+
+## Tokenisation
+https://en.wikipedia.org/wiki/Benford%27s_law
+https://php-ml.readthedocs.io
+
+Unique tokens for each unique context
+
+## Imbalanced data
+One category has more data than the other
+99% of the data is not a chargeback
+Accuracy alone is not very helpful
+ - The model started by flagging 100% as fine (99% accurate, but catches no chargebacks).
+ - Need to collect more data, change methods, or resample the data
+
+## Understand data
+- context
+- Common data vs specific data
+- Continuous vs discrete data
+
+## KNN
+K Nearest Neighbours
+https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
+ - Classify by distance to the nearest training points
+ - Less sensitive to imbalance
+ - Keep K odd (no draws when voting between two categories)
+
+## Handling nominal data
+
+Convert to binary values (one dimension per category)
+- Increases the number of dimensions
+- Normalisation required
+- Keep all dimensions on equal scales
+
+## Contextless data is meaningless
+Is it normal?
+
+## Next to try
+Weighting
+Different dimensions
+Change K value (was 3NN)
+Remove outliers
+Different distance function
+Weighted distance
+
+
+
+
+# Useful links
+https://en.wikipedia.org/wiki/Benford%27s_law
+https://php-ml.readthedocs.io
+https://liam-wiltshire.github.io/talks/?talk=machinelearning&conference=phpuk
+https://joind.in/event/php-uk-conference-2019/learning-the-hows-and-whys-of-machine-learning
+https://en.wikipedia.org/wiki/K-nearest_neighbors_algorithm
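The "Imbalanced data" notes in the file above say that flagging 100% of records as fine was accurate but not helpful. A minimal sketch of why, in plain Python: the 990/10 split, the label values and the function names are all assumptions made for illustration, not data or code from the talk.

```python
# Illustrative labels only: 1 = chargeback, 0 = fine (990 fine, 10 chargebacks).
labels = [0] * 990 + [1] * 10


def always_fine(_transaction):
    """Baseline 'model' that flags every transaction as fine."""
    return 0


predictions = [always_fine(None) for _ in labels]


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual label."""
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


def recall(actual, predicted, positive=1):
    """Fraction of actual positives that the model managed to find."""
    true_pos = sum(
        1 for a, p in zip(actual, predicted) if a == positive and p == positive
    )
    actual_pos = sum(1 for a in actual if a == positive)
    return true_pos / actual_pos if actual_pos else 0.0


baseline_accuracy = accuracy(labels, predictions)
baseline_recall = recall(labels, predictions)
print(f"accuracy={baseline_accuracy:.2f} recall={baseline_recall:.2f}")
# prints: accuracy=0.99 recall=0.00
```

Reporting recall on the minority category next to accuracy is one way to see whether resampling or a change of method is actually helping.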
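The "KNN" and "Handling nominal data" notes in the file above mention distances, an odd K, binary dimensions and normalisation to equal scales. The sketch below puts those pieces together in plain Python. It is not the php-ml API the notes link to. The transactions, the two features and every function name are hypothetical, and Euclidean distance with min-max scaling is just one reasonable choice.

```python
import math
from collections import Counter


def min_max_normalise(rows):
    """Rescale every column to [0, 1] so no dimension dominates the distance.

    Returns the scaled rows and a function that scales new rows the same way.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]

    def scale(row):
        return [
            (v - lo) / (hi - lo) if hi != lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ]

    return [scale(r) for r in rows], scale


def euclidean(a, b):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(
        zip(train_points, train_labels),
        key=lambda pair: euclidean(pair[0], query),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical transactions: [order value, is_new_customer (binary flag)].
raw = [[20.0, 0], [35.0, 0], [25.0, 0], [900.0, 1], [850.0, 1]]
labels = ["fine", "fine", "fine", "chargeback", "chargeback"]

points, scale = min_max_normalise(raw)
prediction = knn_predict(points, labels, scale([880.0, 1]), k=3)
print(prediction)  # prints: chargeback
```

Without the scaling step the order value (tens to hundreds) would swamp the 0/1 flag in the distance, which is the reason the notes ask for equal scales.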