MATH 5900-100 (#12959), Spring 2025
Special Topics in Mathematics: Machine Learning Optimization Algorithms
Syllabus
Official Catalog:
- Description: Specific course content will vary with offering.
- Credit hours: 1-15
- Learning Outcome: Students will increase their knowledge in
Mathematics.
This offering:
- Description: A seminar-style exploration of the optimization
algorithms most commonly used in Machine Learning. Specifically, we will
study TensorFlow’s optimizers in tf.keras.optimizers
and the papers upon which they are based.
- Credit hours: 1
- Learning Outcome: Students will be able to explain the optimization
algorithms commonly used in Machine Learning, their properties, and the
motivation for them.
Class hours/location: Tuesdays 2:00-2:55 PM
- Morton Hall 313
- Will also be on MS Teams.
Web page: http://www.ohiouniversityfaculty.com/mohlenka/2255/5900/
Instructor: Martin J. Mohlenkamp, mohlenka@ohio.edu, Morton Hall 321C,
740-593-1259. The best ways to reach me are email and Teams.
Office hours:
- Monday and Friday 10:45 AM - 11:40 AM; Wednesday 8:35 AM - 9:30
AM.
- I am available to meet at other times; just contact me for an
appointment.
Graded CR or F. To earn CR, you need to:
- Attend at least 50% of the meetings.
  - Attending online counts as 1/2 of an attendance.
- Present on one of the algorithms:
  - Explain the algorithm and give its formulas.
  - Explain its properties and state any theorems about it.
  - Explain the motivation for having an algorithm with those properties.
Religious Accommodations: In accordance with the university’s Interim
policy on reasonable religious accommodations [40.003]:
You may be absent for up to three (3) days each academic semester,
without penalty, to take time off for reasons of faith or religious or
spiritual belief system or to participate in organized activities
conducted under the auspices of a religious denomination, church, or
other religious or spiritual organization. You are required to notify me
in writing of specific dates requested for alternative accommodations no
later than fourteen (14) days after the first day of instruction. These
requests will remain confidential. For more information about this
policy, you may contact the Director and Title IX Coordinator,
Equity and Civil Rights Compliance, Lindley Hall 006, 740-593-9140,
equity@ohio.edu.
Special Needs: If you have specific physical, psychiatric, or
learning disabilities and require accommodations, please let me know as
soon as possible so that your learning needs may be appropriately met.
You should also register with Accessibility Services to
obtain written documentation and to learn about the resources they have
available.
Responsible Employee Reporting Obligation: If I learn of any
instances of sexual harassment, sexual violence, and/or other forms of
prohibited discrimination, I am required to report them. If you wish to
share such information in confidence, contact the Office of Equity and
Civil Rights Compliance instead.
Schedule
(Draft. As we fall behind, topics will be pushed back.)
- January 14: Organize the algorithms and set the schedule of when to
discuss them and who is presenting which one.
- January 21: (Riley) SGD.
- January 28: (Daniel) RMSprop (sketch below). Includes a reference,
  Hinton (2012), to a set of lectures:
  - Lecture 6a: Overview of mini-batch gradient descent
  - Lecture 6b: A bag of tricks for mini-batch gradient descent
  - Lecture 6c: The momentum method
  - Lecture 6d: A separate, adaptive learning rate for each connection
  - Lecture 6e: rmsprop: Divide the gradient by a running average of its
    recent magnitude
- February 4: (Alfred) Adagrad (sketch below). Includes a reference,
  Duchi et al. (2011), to the paper Adaptive Subgradient Methods for
  Online Learning and Stochastic Optimization.
- February 11: Finish Adagrad.
- February 18: (Enoch) Adadelta (sketch below), which is an extension of
  Adagrad. Includes a reference, Zeiler (2012), to the paper ADADELTA:
  An Adaptive Learning Rate Method.
- February 25: (Nutifafa) Adam (sketch below). Includes a reference,
  Kingma and Ba (2014), to the paper Adam: A Method for Stochastic
  Optimization.
- March 4: (Solomon) Adamax (sketch below), which is Adam but using the
  infinity norm.
- March 11: No meeting since it is Spring break.
- March 18: (Nutifafa and Solomon) Nadam (sketch below), which is Adam
  but with Nesterov momentum. Includes a reference, Dozat (2015), to the
  technical report Incorporating Nesterov Momentum into Adam.
- March 25: (Edwin) Adafactor. Includes a reference, Shazeer and Stern
  (2018), to the paper Adafactor: Adaptive Learning Rates with Sublinear
  Memory Cost.
- April 1: (Jessica) Lion. Includes a reference, Chen et al. (2023), to
  the paper Symbolic Discovery of Optimization Algorithms.
- April 8: Catch up.
- April 15: (Fatma) AdamW (sketch below), which is Adam with weight
  decay. Includes a reference, Loshchilov and Hutter (2019), to the
  paper Decoupled Weight Decay Regularization.
- April 22: ?? (Martin?) Ftrl. Includes a reference, McMahan et al.
  (2013), to the paper Ad Click Prediction: a View from the Trenches.
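Algorithm sketches
To fix notation before the presentations, here are minimal single-step
NumPy sketches of several of the updates above, for a weight array w and
gradient grad. Hyperparameter names and default values are my own
choices rather than TensorFlow's, and the cited papers remain the
authority. Each state variable starts at np.zeros_like(w) and is carried
from step to step.
RMSprop (Hinton, 2012, Lecture 6e) divides the gradient by a running
average of its recent magnitude:

    import numpy as np

    def rmsprop_step(w, grad, avg_sq, lr=1e-3, rho=0.9, eps=1e-8):
        # Decaying average of the squared gradient (its "recent magnitude").
        avg_sq = rho * avg_sq + (1 - rho) * grad**2
        # Divide the gradient elementwise by the root of that average.
        w = w - lr * grad / (np.sqrt(avg_sq) + eps)
        return w, avg_sq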
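Adagrad (Duchi et al., 2011), in its commonly used form, accumulates the
sum of squared gradients, so coordinates with a history of large
gradients take smaller steps:

    import numpy as np

    def adagrad_step(w, grad, accum, lr=0.1, eps=1e-8):
        # Running sum of squared gradients over all steps so far.
        accum = accum + grad**2
        # Per-coordinate learning rate shrinks as the accumulator grows.
        w = w - lr * grad / (np.sqrt(accum) + eps)
        return w, accum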
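Adadelta (Zeiler, 2012) replaces Adagrad's growing sum with a decaying
average and scales each step by the RMS of past updates, so no global
learning rate appears:

    import numpy as np

    def adadelta_step(w, grad, avg_sq_g, avg_sq_dw, rho=0.95, eps=1e-6):
        # Decaying average of squared gradients (Adagrad's sum, made forgetful).
        avg_sq_g = rho * avg_sq_g + (1 - rho) * grad**2
        # The RMS of past updates sets the scale of the next update.
        dw = -np.sqrt(avg_sq_dw + eps) / np.sqrt(avg_sq_g + eps) * grad
        avg_sq_dw = rho * avg_sq_dw + (1 - rho) * dw**2
        return w + dw, avg_sq_g, avg_sq_dw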
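Adam (Kingma and Ba, 2014) keeps decaying averages of the gradient and
the squared gradient, corrects their zero-initialization bias, and steps
with their ratio; the step counter t starts at 1:

    import numpy as np

    def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        # First- and second-moment estimates of the gradient.
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        # Correct the bias from initializing m and v at zero.
        m_hat = m / (1 - b1**t)
        v_hat = v / (1 - b2**t)
        w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
        return w, m, v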
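Adamax (from the same Kingma and Ba paper) is Adam but using the
infinity norm: a decaying maximum of |grad| replaces the second-moment
average. The eps here is my own safeguard against division by zero, not
part of the paper:

    import numpy as np

    def adamax_step(w, grad, m, u, t, lr=2e-3, b1=0.9, b2=0.999, eps=1e-8):
        m = b1 * m + (1 - b1) * grad
        # Infinity-norm analogue of Adam's v: a decaying max of |grad|.
        u = np.maximum(b2 * u, np.abs(grad))
        w = w - (lr / (1 - b1**t)) * m / (u + eps)
        return w, m, u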
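Nadam (Dozat, 2015) is Adam but with Nesterov momentum: the update
blends the bias-corrected momentum with the current gradient as a
look-ahead. This sketch omits the momentum schedule used in the report:

    import numpy as np

    def nadam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        # Nesterov look-ahead: mix the bias-corrected momentum with the
        # current gradient rather than using the momentum alone.
        m_bar = b1 * m / (1 - b1**(t + 1)) + (1 - b1) * grad / (1 - b1**t)
        v_hat = v / (1 - b2**t)
        w = w - lr * m_bar / (np.sqrt(v_hat) + eps)
        return w, m, v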
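AdamW (Loshchilov and Hutter, 2019) is Adam with weight decay applied
directly to the weights rather than folded into the gradient as L2
regularization; that decoupling is the point of the paper:

    import numpy as np

    def adamw_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999,
                   eps=1e-8, wd=1e-2):
        m = b1 * m + (1 - b1) * grad
        v = b2 * v + (1 - b2) * grad**2
        m_hat = m / (1 - b1**t)
        v_hat = v / (1 - b2**t)
        # Decay the weights directly; wd * w is not added to the gradient.
        w = w - lr * (m_hat / (np.sqrt(v_hat) + eps) + wd * w)
        return w, m, v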