Reading

Relevant Readings and Papers for the Course

Individual Topics References

There is no official textbook for this course. The primary textbooks listed at the top of this page are useful resources that cover all the fundamentals we do in class. The rest of the papers and resources listed below are seminal (i.e., famous and influential) or otherwise interesting works on the topics of the course.

Discussion and Annotation of Papers

Many papers referred to in the course should have a Hypothesis link associated with them.

Hypothesis is a website that allows convenient commenting, annotation, and sharing of papers. We have a group for the course on the site: (Hypothes.is Group: ECE657A). The navigation on the Hypothesis site is not very good, so you can use the links on this page to find related papers and jump straight to the Hypothesis page. If there isn’t one, you can create it and ask the course staff to add the Hypothesis link.

Feel free to comment on, annotate, and discuss papers on the site in the ECE657A group. The Hypothes.is annotation discussion for the course is lightly moderated. If you encounter an offensive or inappropriate comment, you can flag the annotation, and the course administrator (Prof or TA) will see it and can choose to remove or otherwise deal with the comment. The original poster will not see your flag.

Jump to Topic: text ~ seminal ~ machine-learning ~ dimensionality-reduction ~ auto encoders ~ kernel-methods ~ ensemble-methods ~ deep-learning ~ unsupervised-learning ~ variational-inference ~ support-vector-machines ~ convolutional-network ~ recurrent-networks ~ anomaly-detection ~ data-augmentation ~ loss-functions ~ optimizer-gradient-methods ~ ablation-study ~ natural-language-processing ~ attention-mechanism ~ transformers ~ transfer-learning ~ active-learning ~ ai-for-science

text

  1. Machine Learning: A Probabilistic Perspective
    Murphy, Kevin
    2012.
    keywords: textbooks-optional
  2. Pattern Classification
    Duda, R O, Hart, P E, and Stork, D G
    2000.
    keywords: textbooks-optional
  3. Deep Learning
    Goodfellow, Ian, Bengio, Yoshua, and Courville, Aaron
    2016.
    keywords: textbooks-optional

seminal

  1. Deep Learning of Representations for Unsupervised and Transfer Learning
    Bengio, Yoshua
    In Proceedings of ICML Workshop on Unsupervised and Transfer Learning. Bellevue, Washington, USA, 2012.
    keywords: machine-learning ~ transfer-learning ~ representation-learning ~ seminal ~ lecture-1221
  2. A tutorial on speech understanding systems.
    Newell, Allen
    1975.
    keywords: artificial-intelligence ~ ablation-study ~ seminal ~ natural-language-processing
  3. U-NET
    U-Net: Convolutional Networks for Biomedical Image Segmentation
    Ronneberger, Olaf, Fischer, Philipp, and Brox, Thomas
    In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Cham, 2015.
    keywords: deep-learning ~ medical-imaging ~ convolutional-network ~ data-augmentation ~ representation-learning ~ seminal
  4. GloVe: Global Vectors for Word Representation
    Pennington, Jeffrey, Socher, Richard, and Manning, Christopher
    In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar, 2014.
    keywords: natural-language-processing ~ seminal ~ machine-learning ~ representation-learning
  5. Speech Recognition: A Tutorial Overview
    White, G. M.
    Computer. 1976.
  6. OC-SVM
    Support vector method for novelty detection
    Schölkopf, Bernhard, Williamson, Robert C, Smola, Alex J, Shawe-Taylor, John, and Platt, John C
    In NeurIPS conference. 2000.
  7. Deep learning
    LeCun, Yann, Bengio, Yoshua, and Hinton, Geoffrey
    Nature. 2015.
    keywords: seminal ~ deep-learning ~ neural-networks ~ machine-learning
  8. Attention is All you Need
    Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia
    Advances in Neural Information Processing Systems. Long Beach, California, 2017.

machine-learning

  1. Deep Learning of Representations for Unsupervised and Transfer Learning
    Bengio, Yoshua
    In Proceedings of ICML Workshop on Unsupervised and Transfer Learning. Bellevue, Washington, USA, 2012.
    keywords: machine-learning ~ transfer-learning ~ representation-learning ~ seminal ~ lecture-1221
  2. The Genius Neuroscientist Who Might Hold the Key to True AI
    Raviv, Shaun
    2018.
    keywords: theory ~ machine-learning ~ active-learning ~ free-energy ~ artificial-intelligence ~ neuroscience ~ course-part-1 ~ fun-reading
  3. GloVe: Global Vectors for Word Representation
    Pennington, Jeffrey, Socher, Richard, and Manning, Christopher
    In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar, 2014.
    keywords: natural-language-processing ~ seminal ~ machine-learning ~ representation-learning
  4. Speech Recognition: A Tutorial Overview
    White, G. M.
    Computer. 1976.
  5. Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche
    Coupé, Christophe, Oh, Yoon Mi, Dediu, Dan, and Pellegrino, François
    Science Advances. 2019.
    keywords: natural-language-processing ~ representation-learning ~ machine-learning ~ compact-encoding ~ science ~ psychology
  6. Incremental local outlier detection for data streams
    Pokrajac, Dragoljub, Lazarevic, Aleksandar, and Latecki, Longin Jan
    In 2007 IEEE symposium on CIDM. 2007.
    keywords: anomaly-detection ~ streaming-ensembles ~ machine-learning
  7. OC-SVM
    Support vector method for novelty detection
    Schölkopf, Bernhard, Williamson, Robert C, Smola, Alex J, Shawe-Taylor, John, and Platt, John C
    In NeurIPS conference. 2000.
  8. Deep learning
    LeCun, Yann, Bengio, Yoshua, and Hinton, Geoffrey
    Nature. 2015.
    keywords: seminal ~ deep-learning ~ neural-networks ~ machine-learning
  9. Attention is All you Need
    Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia
    Advances in Neural Information Processing Systems. Long Beach, California, 2017.
  10. Intuitive Understanding of Attention Mechanism in Deep Learning
    Lamba, Harshall
    2019.
    keywords: machine-learning ~ deep-learning ~ attention-mechanism ~ blog ~ website
  11. How Does Attention Work in Encoder-Decoder Recurrent Neural Networks
    Brownlee, Jason
    2017.
    keywords: attention-mechanism ~ blog ~ machine-learning ~ website ~ autoencoders ~ representation-learning ~ deep-learning

dimensionality-reduction

  1. Fisher and Kernel Fisher Discriminant Analysis: Tutorial
    Ghojogh, Benyamin, Karray, Fakhri, and Crowley, Mark
    2019.
    keywords: kernel-methods ~ dimensionality-reduction
  2. Multidimensional Scaling, Sammon Mapping, and Isomap: Tutorial and Survey
    Ghojogh, Benyamin, Ghodsi, Ali, Karray, Fakhri, and Crowley, Mark
    keywords: dimensionality-reduction ~ manifold-learning
  3. Stochastic Neighbor Embedding with Gaussian and Student-t Distributions: Tutorial and Survey
    Ghojogh, Benyamin, Ghodsi, Ali, Karray, Fakhri, and Crowley, Mark
    2020.
    keywords: dimensionality-reduction ~ probability-distributions
  4. Unsupervised and Supervised Principal Component Analysis: Tutorial
    Ghojogh, Benyamin, and Crowley, Mark
    2019.

auto encoders

    kernel-methods

    1. Fisher and Kernel Fisher Discriminant Analysis: Tutorial
      Ghojogh, Benyamin, Karray, Fakhri, and Crowley, Mark
      2019.
      keywords: kernel-methods ~ dimensionality-reduction
    2. OC-SVM
      Support vector method for novelty detection
      Schölkopf, Bernhard, Williamson, Robert C, Smola, Alex J, Shawe-Taylor, John, and Platt, John C
      In NeurIPS conference. 2000.

    ensemble-methods

    1. Mondrian forests: Efficient online random forests
      Lakshminarayanan, Balaji, Roy, Daniel M, and Teh, Yee Whye
      In NeurIPS conference. 2014.
      keywords: ensemble-methods ~ streaming-ensembles
    2. Mondrian forests for large-scale regression when uncertainty matters
      Lakshminarayanan, Balaji, Roy, Daniel M, and Teh, Yee Whye
      In Artificial Intelligence and Statistics. 2016.
      keywords: ensemble-methods ~ streaming-ensembles
    3. The Mondrian Process
      Roy, Daniel M, and Teh, Yee Whye
      In NeurIPS conference. 2008.
      keywords: ensemble-methods ~ streaming-ensembles
    4. On-line random forests
      Saffari, Amir, Leistner, Christian, Santner, Jakob, Godec, Martin, and Bischof, Horst
      In 2009 IEEE ICCV workshops. 2009.
      keywords: ensemble-methods ~ streaming-ensembles
    5. Streaming random forests
      Abdulsalam, Hanady, Skillicorn, David B, and Martin, Patrick
      In 11th IDEAS 2007. 2007.
      keywords: ensemble-methods ~ streaming-ensembles
    6. Extremely randomized trees
      Geurts, Pierre, Ernst, Damien, and Wehenkel, Louis
      Machine learning. 2006.
      keywords: ensemble-methods
    7. Decision forests: A unified framework for classification, regression, density estimation, manifold learning and semi-supervised learning
      Criminisi, Antonio, Shotton, Jamie, and Konukoglu, Ender
      Foundations and Trends® in Computer Graphics and Vision. 2012.
      keywords: ensemble-methods
    8. Random forests
      Breiman, Leo
      Machine learning. 2001.
      keywords: ensemble-methods
    9. Binary Space Partitioning Forests
      Fan, Xuhui, Li, Bin, and Sisson, Scott Anthony
      In 22nd AISTATS conference. 2019.
      keywords: ensemble-methods
    10. The binary space partitioning-tree process
      Fan, Xuhui, Li, Bin, and Sisson, Scott Anthony
      In 21st AISTATS conference. 2018.
      keywords: ensemble-methods
    11. Understanding random forests: From theory to practice
      Louppe, Gilles
      2014.
      keywords: ensemble-methods

    deep-learning

    1. A Comparative Study of Deep Learning Loss Functions for Multi-Label Remote Sensing Image Classification
      Yessou, Hichame, Sumbul, Gencer, and Demir, Begüm
      In IEEE International Geoscience and Remote Sensing Symposium. 2020.
      keywords: computer-vision ~ course-review-definitions ~ deep-learning ~ loss-functions
    2. U-NET
      U-Net: Convolutional Networks for Biomedical Image Segmentation
      Ronneberger, Olaf, Fischer, Philipp, and Brox, Thomas
      In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Cham, 2015.
      keywords: deep-learning ~ medical-imaging ~ convolutional-network ~ data-augmentation ~ representation-learning ~ seminal
    3. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
      Bai, Shaojie, Kolter, J. Zico, and Koltun, Vladlen
      arXiv:1803.01271 [cs]. 2018.
    4. Deep learning
      LeCun, Yann, Bengio, Yoshua, and Hinton, Geoffrey
      Nature. 2015.
      keywords: seminal ~ deep-learning ~ neural-networks ~ machine-learning
    5. An overview of gradient descent optimization algorithms
      Ruder, Sebastian
      arXiv:1609.04747 [cs]. 2017.
    6. Attention is All you Need
      Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia
      Advances in Neural Information Processing Systems. Long Beach, California, 2017.
    7. Intuitive Understanding of Attention Mechanism in Deep Learning
      Lamba, Harshall
      2019.
      keywords: machine-learning ~ deep-learning ~ attention-mechanism ~ blog ~ website
    8. How Does Attention Work in Encoder-Decoder Recurrent Neural Networks
      Brownlee, Jason
      2017.
      keywords: attention-mechanism ~ blog ~ machine-learning ~ website ~ autoencoders ~ representation-learning ~ deep-learning
    9. Data Efficient and Weakly Supervised Computational Pathology on Whole Slide Images
      Lu, Ming Y., Williamson, Drew F. K., Chen, Tiffany Y., Chen, Richard J., Barbieri, Matteo, and Mahmood, Faisal
      arXiv:2004.09666 [cs, eess, q-bio]. 2020.
      keywords: attention-mechanism ~ classification ~ clustering ~ deep-learning ~ medical-imaging ~ computer-vision ~ digital-pathology

    unsupervised-learning

    1. Unsupervised word embeddings capture latent knowledge from materials science literature
      Tshitoyan, Vahe, Dagdelen, John, Weston, Leigh, Dunn, Alexander, Rong, Ziqin, Kononova, Olga, Persson, Kristin A., Ceder, Gerbrand, and Jain, Anubhav
      Nature. 2019.
      keywords: representation-learning ~ ai-for-material-design ~ ai-for-science ~ unsupervised-learning ~ embeddings ~ proj-chemgymrl ~ nlp ~ natural-language-processing

    variational-inference

    1. Factor Analysis, Probabilistic Principal Component Analysis, Variational Inference, and Variational Autoencoder: Tutorial and Survey
      Ghojogh, Benyamin, Ghodsi, Ali, Karray, Fakhri, and Crowley, Mark
      2020.
      keywords: course-dive-deeper-into-a-topic ~ factor-analysis ~ variational-inference

    support-vector-machines

    1. OC-SVM
      Support vector method for novelty detection
      Schölkopf, Bernhard, Williamson, Robert C, Smola, Alex J, Shawe-Taylor, John, and Platt, John C
      In NeurIPS conference. 2000.

    convolutional-network

    1. Are Pre-trained Convolutions Better than Pre-trained Transformers?
      Tay, Yi, Dehghani, Mostafa, Gupta, Jai, Bahri, Dara, Aribandi, Vamsi, Qin, Zhen, and Metzler, Donald
      In ACL 2021. 2021.
    2. Pay Less Attention with Lightweight and Dynamic Convolutions
      Wu, Felix, Fan, Angela, Baevski, Alexei, Dauphin, Yann N., and Auli, Michael
      In International Conference on Learning Representations (ICLR 2019). 2019.
    3. CNN-Course
      Stanford CS231n: Convolutional Neural Networks for Visual Recognition
      Li, Fei-Fei
      2021.
      keywords: cnns ~ deep learning ~ convolutional-network ~ reference
    4. U-NET
      U-Net: Convolutional Networks for Biomedical Image Segmentation
      Ronneberger, Olaf, Fischer, Philipp, and Brox, Thomas
      In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Cham, 2015.
      keywords: deep-learning ~ medical-imaging ~ convolutional-network ~ data-augmentation ~ representation-learning ~ seminal

    recurrent-networks

    1. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
      Bai, Shaojie, Kolter, J. Zico, and Koltun, Vladlen
      arXiv:1803.01271 [cs]. 2018.

    anomaly-detection

    1. Fast Anomaly Detection for Streaming Data
      Tan, Swee Chuan, Ting, Kai Ming, and Liu, Tony Fei
      keywords: anomaly-detection
    2. Anomaly detection: A survey
      Chandola, Varun, Banerjee, Arindam, and Kumar, Vipin
      ACM computing surveys (CSUR). 2009.
      keywords: anomaly-detection
    3. Isolation forest
      Liu, Fei Tony, Ting, Kai Ming, and Zhou, Zhi-Hua
      In 2008 Eighth IEEE International Conference on Data Mining. 2008.
      keywords: anomaly-detection
    4. Isolation-based anomaly detection
      Liu, Fei Tony, Ting, Kai Ming, and Zhou, Zhi-Hua
      ACM Transactions on Knowledge Discovery from Data (TKDD). 2012.
      keywords: anomaly-detection
    5. LOF: identifying density-based local outliers
      Breunig, Markus M, Kriegel, Hans-Peter, Ng, Raymond T, and Sander, Jörg
      In ACM sigmod record. 2000.
      keywords: anomaly-detection
    6. Incremental local outlier detection for data streams
      Pokrajac, Dragoljub, Lazarevic, Aleksandar, and Latecki, Longin Jan
      In 2007 IEEE symposium on CIDM. 2007.
      keywords: anomaly-detection ~ streaming-ensembles ~ machine-learning
    7. A review of novelty detection
      Pimentel, Marco AF, Clifton, David A, Clifton, Lei, and Tarassenko, Lionel
      Signal Processing. 2014.
      keywords: anomaly-detection
    8. OC-SVM
      Support vector method for novelty detection
      Schölkopf, Bernhard, Williamson, Robert C, Smola, Alex J, Shawe-Taylor, John, and Platt, John C
      In NeurIPS conference. 2000.
    9. Outlier Detection Data Sets
      Rayana, Shebuti
      2019.
      keywords: anomaly-detection
    10. Intrusion Detection Evaluation Dataset (CICIDS2017)
      Canadian Institute for Cybersecurity
      2017.
      keywords: anomaly-detection
    11. Toward Generating a New Intrusion Detection Dataset and Intrusion Traffic Characterization.
      Sharafaldin, Iman, Lashkari, Arash Habibi, and Ghorbani, Ali A
      In ICISSP. 2018.
      keywords: anomaly-detection
    12. Anomaly detection via over-sampling principal component analysis
      Yeh, Yi-Ren, Lee, Zheng-Yi, and Lee, Yuh-Jye
      2009.
      keywords: anomaly-detection
    13. Anomaly detection via online oversampling principal component analysis
      Lee, Yuh-Jye, Yeh, Yi-Ren, and Wang, Yu-Chiang Frank
      IEEE transactions on knowledge and data engineering. 2013.
      keywords: anomaly-detection
    14. Online anomaly detection using KDE
      Ahmed, Tarem
      In 2009 IEEE conference on global telecommunications. 2009.
      keywords: anomaly-detection

    data-augmentation

    1. U-NET
      U-Net: Convolutional Networks for Biomedical Image Segmentation
      Ronneberger, Olaf, Fischer, Philipp, and Brox, Thomas
      In Medical Image Computing and Computer-Assisted Intervention – MICCAI 2015. Cham, 2015.
      keywords: deep-learning ~ medical-imaging ~ convolutional-network ~ data-augmentation ~ representation-learning ~ seminal
    2. NeurIPS 2020: Unsupervised Data Augmentation for Consistency Training
      keywords: medical-imaging ~ data-augmentation ~ natural-language-processing

    loss-functions

    1. A Comparative Study of Deep Learning Loss Functions for Multi-Label Remote Sensing Image Classification
      Yessou, Hichame, Sumbul, Gencer, and Demir, Begüm
      In IEEE International Geoscience and Remote Sensing Symposium. 2020.
      keywords: computer-vision ~ course-review-definitions ~ deep-learning ~ loss-functions

    optimizer-gradient-methods

    1. An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
      Bai, Shaojie, Kolter, J. Zico, and Koltun, Vladlen
      arXiv:1803.01271 [cs]. 2018.
    2. Adam: A method for stochastic optimization
      Kingma, Diederik P, and Ba, Jimmy
      arXiv preprint arXiv:1412.6980. 2014.
      keywords: optimizer-gradient-methods ~ optimizer-adam
    3. An overview of gradient descent optimization algorithms
      Ruder, Sebastian
      arXiv:1609.04747 [cs]. 2017.

    ablation-study

    1. A tutorial on speech understanding systems.
      Newell, Allen
      1975.
      keywords: artificial-intelligence ~ ablation-study ~ seminal ~ natural-language-processing
    2. Selective Search for Object Recognition
      Uijlings, J. R. R., van de Sande, K. E. A., Gevers, T., and Smeulders, A. W. M.
      International Journal of Computer Vision. 2013.
      keywords: ablation-study
    3. Rich feature hierarchies for accurate object detection and semantic segmentation
      Girshick, Ross, Donahue, Jeff, Darrell, Trevor, and Malik, Jitendra
      arXiv:1311.2524 [cs]. 2014.
      keywords: ablation-study ~ computer-vision ~ representation-learning ~ object-detection
    4. Analysing differences between algorithm configurations through ablation
      Fawcett, Chris, and Hoos, Holger H.
      Journal of Heuristics. 2016.
      keywords: ablation-study

    natural-language-processing

    1. Unsupervised word embeddings capture latent knowledge from materials science literature
      Tshitoyan, Vahe, Dagdelen, John, Weston, Leigh, Dunn, Alexander, Rong, Ziqin, Kononova, Olga, Persson, Kristin A., Ceder, Gerbrand, and Jain, Anubhav
      Nature. 2019.
      keywords: representation-learning ~ ai-for-material-design ~ ai-for-science ~ unsupervised-learning ~ embeddings ~ proj-chemgymrl ~ nlp ~ natural-language-processing
    2. A tutorial on speech understanding systems.
      Newell, Allen
      1975.
      keywords: artificial-intelligence ~ ablation-study ~ seminal ~ natural-language-processing
    3. Delphi: Towards Machine Ethics and Norms
      Jiang, Liwei, Hwang, Jena D., Bhagavatula, Chandrasekhar, Bras, Ronan Le, Forbes, Maxwell, Borchardt, Jon, Liang, Jenny, Etzioni, Oren, Sap, Maarten, and Choi, Yejin
      2021.
      keywords: ai-ethics ~ ai-for-good ~ natural-language-processing ~ crowdsourcing ~ fun-reading
    4. GloVe: Global Vectors for Word Representation
      Pennington, Jeffrey, Socher, Richard, and Manning, Christopher
      In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP). Doha, Qatar, 2014.
      keywords: natural-language-processing ~ seminal ~ machine-learning ~ representation-learning
    5. Speech Recognition: A Tutorial Overview
      White, G. M.
      Computer. 1976.
    6. Different languages, similar encoding efficiency: Comparable information rates across the human communicative niche
      Coupé, Christophe, Oh, Yoon Mi, Dediu, Dan, and Pellegrino, François
      Science Advances. 2019.
      keywords: natural-language-processing ~ representation-learning ~ machine-learning ~ compact-encoding ~ science ~ psychology
    7. NeurIPS 2020: Unsupervised Data Augmentation for Consistency Training
      keywords: medical-imaging ~ data-augmentation ~ natural-language-processing
    8. Improving Language Understanding by Generative Pre-Training
      Radford, Alec, Narasimhan, Karthik, Salimans, Tim, and Sutskever, Ilya
      keywords: natural-language-processing ~ attention-mechanism
    9. Attention is All you Need
      Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia
      Advances in Neural Information Processing Systems. Long Beach, California, 2017.

    attention-mechanism

    1. Are Pre-trained Convolutions Better than Pre-trained Transformers?
      Tay, Yi, Dehghani, Mostafa, Gupta, Jai, Bahri, Dara, Aribandi, Vamsi, Qin, Zhen, and Metzler, Donald
      In ACL 2021. 2021.
    2. Pay Less Attention with Lightweight and Dynamic Convolutions
      Wu, Felix, Fan, Angela, Baevski, Alexei, Dauphin, Yann N., and Auli, Michael
      In International Conference on Learning Representations (ICLR 2019). 2019.
    3. Improving Language Understanding by Generative Pre-Training
      Radford, Alec, Narasimhan, Karthik, Salimans, Tim, and Sutskever, Ilya
      keywords: natural-language-processing ~ attention-mechanism
    4. Attention is All you Need
      Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia
      Advances in Neural Information Processing Systems. Long Beach, California, 2017.
    5. Intuitive Understanding of Attention Mechanism in Deep Learning
      Lamba, Harshall
      2019.
      keywords: machine-learning ~ deep-learning ~ attention-mechanism ~ blog ~ website
    6. How Does Attention Work in Encoder-Decoder Recurrent Neural Networks
      Brownlee, Jason
      2017.
      keywords: attention-mechanism ~ blog ~ machine-learning ~ website ~ autoencoders ~ representation-learning ~ deep-learning
    7. Data Efficient and Weakly Supervised Computational Pathology on Whole Slide Images
      Lu, Ming Y., Williamson, Drew F. K., Chen, Tiffany Y., Chen, Richard J., Barbieri, Matteo, and Mahmood, Faisal
      arXiv:2004.09666 [cs, eess, q-bio]. 2020.
      keywords: attention-mechanism ~ classification ~ clustering ~ deep-learning ~ medical-imaging ~ computer-vision ~ digital-pathology

    transformers

    1. Are Pre-trained Convolutions Better than Pre-trained Transformers?
      Tay, Yi, Dehghani, Mostafa, Gupta, Jai, Bahri, Dara, Aribandi, Vamsi, Qin, Zhen, and Metzler, Donald
      In ACL 2021. 2021.
    2. Pay Less Attention with Lightweight and Dynamic Convolutions
      Wu, Felix, Fan, Angela, Baevski, Alexei, Dauphin, Yann N., and Auli, Michael
      In International Conference on Learning Representations (ICLR 2019). 2019.
    3. Attention is All you Need
      Vaswani, Ashish, Shazeer, Noam, Parmar, Niki, Uszkoreit, Jakob, Jones, Llion, Gomez, Aidan N., Kaiser, Łukasz, and Polosukhin, Illia
      Advances in Neural Information Processing Systems. Long Beach, California, 2017.

    transfer-learning

    1. Deep Learning of Representations for Unsupervised and Transfer Learning
      Bengio, Yoshua
      In Proceedings of ICML Workshop on Unsupervised and Transfer Learning. Bellevue, Washington, USA, 2012.
      keywords: machine-learning ~ transfer-learning ~ representation-learning ~ seminal ~ lecture-1221

    active-learning

    1. The Genius Neuroscientist Who Might Hold the Key to True AI
      Raviv, Shaun
      2018.
      keywords: theory ~ machine-learning ~ active-learning ~ free-energy ~ artificial-intelligence ~ neuroscience ~ course-part-1 ~ fun-reading

    ai-for-science

    1. Unsupervised word embeddings capture latent knowledge from materials science literature
      Tshitoyan, Vahe, Dagdelen, John, Weston, Leigh, Dunn, Alexander, Rong, Ziqin, Kononova, Olga, Persson, Kristin A., Ceder, Gerbrand, and Jain, Anubhav
      Nature. 2019.
      keywords: representation-learning ~ ai-for-material-design ~ ai-for-science ~ unsupervised-learning ~ embeddings ~ proj-chemgymrl ~ nlp ~ natural-language-processing