Difference between revisions of "Step 1"

From Research management course
 
(44 intermediate revisions by the same user not shown)
Line 1: Line 1:
This is a preparatory course for the main part of My First Scientific Paper. Its goal is to spread the work of scientific research evenly over the year.
+
The most important things come first. We discuss the main message delivered by a scientific paper. We explore the first three elements of a scientific paper: the abstract, the highlights, and the keywords. The main message shall reveal itself through all elements of the paper, but we leave the rest for next time: the title, introduction, problem statement, goal of the computational experiment, and conclusion. We select a paper and practice reconstructing these three elements.
  
Scientific research is a collective activity, and your main goal during this course is to find a scientific advisor who devotes their time to you in exchange for ''academic'' results. So, in the end, you have some skills in how to
+
== The seminar ==
# select a research topic,
+
# [https://forms.gle/FrUzQbRSLPTVRMXM9 The warm-up 3-minute test]
# critically analyze the literature,
+
# Model, Algorithm, Method: Machine learning in a nutshell
# state the problem,
+
<!-- #* more terms: statistical hypothesis, algebraic structure, model selection, bayesian inference -->
# and convey your well-reasoned message to the reader.
+
# Step 1 homework, how to read: the scheme <!--(how to search is a separate topic)-->
 
+
<!-- # Structure of the main message -->
But again, '''your main goal''' is to find a highly qualified scientific advisor (with their team) who gives you their valuable time.
+
# Structure of the abstract
 
+
<!-- # The second and the last slide of your talk -->
To accomplish the homework of this course, you may select a topic in applied mathematics or theoretical machine learning. Better still, and we recommend it, change your topic after each step to get a feel for the various fields of your future expertise.
+
# Extracting keywords
 +
# Highlights: compressing the paper
 +
# Infrastructure for your homework
 +
#* GitHub: organize the repository
 +
#* LaTeX: compile your file and commit without temporary files
 +
# The papers to select from
 +
# Step 0 homework results discussion
 +
# Optional GPT-role discussion
  
 
==Resources==
 
==Resources==
Step 1 Youtube video (expected with online version)
+
Step 1 YouTube [https://youtube.com/live/EZH3RdSXRtc video]
 +
'''Warning!''' The wrong microphone was used. This video will be re-recorded in a couple of days.
 
<!--  
 
<!--  
 
* [https://youtu.be/5RVkgUOYiro Step 1 Youtube video]
 
* [https://youtu.be/5RVkgUOYiro Step 1 Youtube video]
Line 20: Line 28:
  
 
==Homework==
 
==Homework==
# [https://forms.gle/y6kbSz7wLX91stZ19 Fill out the form to attend the course]. Keep your email to log in.
+
# Set up your GitHub repository using [https://github.com/vadim-vic/the-Art-homework/ this template], see [https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template how]
# Set up the tools
+
# Select a paper to read from the list below
## Online LaTeX is [https://www.overleaf.com/ Overleaf]
+
# Reconstruct its
## Offline LaTeX is [https://miktex.org/ MiKTeX] with [https://www.winedt.com/ WinEdt] or [https://pages.uoregon.edu/koch/texshop TeXShop] with [https://www.xm1math.net/texmaker/ Texmaker]
+
## Abstract
## Offline BibTeX is [https://www.jabref.org JabRef]
+
## Keywords
## Sign up for [https://github.com GitHub] to keep your progress
+
## Highlights
# If you are not familiar with [https://www.microsoft.com/en-us/research/uploads/prod/2006/01/Bishop-Pattern-Recognition-and-Machine-Learning-2006.pdf C.M. Bishop's book, 2006] or the 2024 one, start reading it
+
## Short motivation for why you selected this paper (no templates here, since it is an extra topic to discuss)
# Watch a useful lecture [https://www.youtube.com/watch?v=Unzc731iCUY "How to speak"] by Patrick Winston, 2018
+
# Compile and upload TEX and PDF to GitHub (no temporary files, please)
 +
# Fill out the [https://forms.gle/KqhRk9R6w61snAB9A Step 1 questionnaire]
 +
# Refresh your memory of linear models for the next warm-up test; either
 +
## look for the terms [https://en.wikipedia.org/wiki/Dot_product dot product], [https://en.wikipedia.org/wiki/Scalar_projection scalar projection], [https://en.wikipedia.org/wiki/Linear_least_squares least squares], [https://en.wikipedia.org/wiki/Transformation_matrix linear map]  
 +
## or, for fun, read pages 33-39 (Section L3) of [https://klassfeldtheorie.wordpress.com/wp-content/uploads/2018/10/mathematische-methoden-310117.pdf the book].
  
==Fun==
+
'''Note''' that we always respect your credit hours. So please keep track of them.
Why do we need weekly homework? Look [https://www.youtube.com/shorts/Rvmvt7gscIM how two neurons connect one another]. So we learn as we train. <!-- [https://www.youtube.com/watch?v=ehbFoALnV4o how a chick's neurons develop] connecting the opposite side of the central nervous system. -->
 
  
==Transcript of the video==
+
'''Your gain''' here is the ability to find the main message of a paper.
Hello dear colleagues! 
 
  
This is an orientation hour for The Art of Scientific Research.
+
=== How to read ===
It is a preparatory course for your thesis in Applied Mathematics and Machine Learning.
+
There are many pieces of advice on how to read scientific papers, see [https://forums.fast.ai/t/how-to-read-research-papers-andrew-ng/66892 an example].
 +
<!-- including [https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7392212/ exhaustive ones].-->
 +
We suggest briefly looking through the paper's
 +
# highlight, or pitch in the abstract,
 +
# central formulas,
 +
# clarifying figure,
 +
# plots and tables,
 +
# find the main idea.  
 +
Then answer these questions. What is:
 +
# the topic?
 +
# the subject of research?
 +
# the main idea or message?
 +
# the impact, is it useful for you?
  
You want to
+
===Abstract===
1) fit research activity,
+
The abstract of a paper is the first piece the reader looks at. Usually, it is written twice: at the beginning of the research and again after the paper is done, before submission. Due to its importance, several versions of the abstract from different points of view are welcome.
2) be part of the scientific community,
 
3) define your field of expertise. 
 
You aim to find your scientific advisor and carefully select your research project.
 
  
The task of project selection can be more difficult than the task of project accomplishment, since it requires higher qualifications from the researcher. So we split the whole thing into two parts. The first one is The Art. The second one is called My First Scientific Paper. The second one is already more than 10 years old and well established. It has a separate schedule, which starts next spring. This course, in contrast, is new and experimental. The organizers' goal is to boost the quality of student theses.
+
The abstract is limited to 600 characters. It may contain
 +
# wide-range field of the investigated problem,
 +
# narrow problem to focus on,
 +
# features and conditions of the problem,
 +
# the idea of the suggested solution,
 +
# the novelty and alternative solutions to compare with,
 +
# application to illustrate with.
  
A student's work depends heavily on how the student fits into the scientific community, and this community should be organized as follows. First, the student is the project driver, highly committed to the work. Second, a consultant (usually a graduate or PhD student) helps the younger student for about an hour a week. Third, a professor, an expert in the field, is responsible for the end result of the project. At the beginning, the professor states the problem and, at the end, harvests the results.
+
Examples of abstracts to discuss, [https://m1p.org/images/d/db/M1p_2024_lect2_c.pdf a draft].
 +
<!-- [http://www.machinelearning.ru/wiki/images/1/19/TheSecondSlide.pdf Think of a motivation of your research in slides]. -->
  
Producing scientific results and reporting them is our ultimate goal. To do this, we have to learn
+
===Keywords===
how to state the problem 
+
The keywords of your paper shall match the subject of your research, and show the area and the focus. Ensure these keywords are used in your paper frequently and play an important role. They shall be recognized terms in your field of knowledge. See detailed [https://www.redwoodink.com/resources/how-to-choose-the-best-keywords-for-your-research-manuscript explanations] and Elsevier [https://scientific-publishing.webshop.elsevier.com/manuscript-preparation/how-choose-keywords-manuscript/ recommendations].
how to recognize if the project is feasible and 
 
how to present our results
 
We organize each seminar in the following way. 
 
First, there will be a minute test and a brief analysis. 
 
Second, the theoretical part will be about the style of scientific research and about some aspects of machine learning.
 
After that, we talk about the new homework and then discuss the homework we have done.
 
Each semester comprises two modules. The first is about how to deliver your message and how to pitch your project. The second is about how to reason your project and how to establish its theoretical part.
 
  
You can fix your favorite project at the beginning, but I recommend you change your subject of research each week to get a feel for different subjects.
+
===Highlights===
This plan of classes is provisional, but it follows the road map of preparing a scientific paper: we select a paper to start from, discuss its principles, formulate results, collect a review, and prepare a small talk about it.
+
To write highlights, see the official [https://www.elsevier.com/researcher/author/tools-and-resources/highlights Elsevier] version, a useful piece of
 +
[https://editingindia.wordpress.com/2015/07/14/writing-highlights-for-elsevier-dos-and-donts/ advice], and [https://medium.com/@miguel_93656/writing-meaningful-highlights-in-scientific-papers-4371ff33ab8a Medium] clarifications.
  
Here we walked through some themes to discuss. I hope it shrinks. Each of our seminars will have a short theoretical part. It will not introduce the methods themselves; it will introduce how to deliver your message to your reader and your audience in an easy and fast way.
+
==Papers to choose from==
 +
Please read a paper from this list and formulate its main message. Imagine you are a journal editor or a reviewer who receives scientific papers randomly and picks one.
  
The scoring. We would like to avoid the written exam, so there will be weekly scoring with tests at the beginning of a seminar, your talks at the end of a seminar, and your written homework. 
+
''If these papers are too difficult for you to understand'', it is no big deal. Most likely, you were going to read a paper of your own interest. Read it. The main requirement is that it must be a scientific paper. See the next section.
Of course, each module ends with a small piece of coursework; for the first module it is a one-slide talk plus about two pages of project description. Each week you will get several points; these points add up and are scaled. The deadlines are strict.
 
  
Since this course is new, we don't know how it will end. I hope everything will be fine. With a large group of students, it will not be possible to review all your texts and listen to all your talks, so the feedback will be limited.
+
You can briefly go through the bold items of [https://cseweb.ucsd.edu/~wgg/CSE210/howtoread.html How to Read an Engineering Research Paper by W.G. Griswold]
  
It is expected that you have a bachelor's degree.  But these first six items are about just the first two years of bachelor's study. And the last seventh item is about your basic knowledge of machine learning. We recommend the book of Christopher Bishop as a standard of machine learning knowledge.
+
'''IMPORTANT'''. Since the homework is to reconstruct the abstract of one of these papers, please try to skip the published abstract. Cover it and start reading according to the discussed reading scheme.
Here are the main references but in fact, these references will be spread over the homework and some of these books are quite large to read. So we will just point to their parts.
 
Is expected to be Saturday 2:40 p.m. 
 
  
We go to the Step Zero. 
+
# Distinguishing time-delayed causal interactions using convergent cross mapping [https://doi.org/10.1038/srep14750 DOI]
Please read this motivational text and the syllabi of this part and of My First Scientific Paper.
+
# Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria [https://doi.org/10.1016/j.eswa.2017.01.048 DOI], [https://m1p.org/papers/Katrutsa2016QPFeatureSelection.pdf PDF]
 +
# Spatio-temporal filling of missing points in geophysical data sets [https://doi.org/10.5194/npg-13-151-2006 DOI]
 +
# Analytic and stochastic methods of structure parameter estimation [https://doi.org/10.15388/Informatica.2016.102 DOI]
 +
# Longitudinal predictive modeling of tau progression along the structural connectome [https://doi.org/10.1016/j.neuroimage.2021.118126 DOI]
 +
# Generative or Discriminative? Getting the Best of Both Worlds [https://www.microsoft.com/en-us/research/wp-content/uploads/2016/05/Bishop-Valencia-07.pdf PDF]
 +
# Neural Ordinary Differential Equations [https://proceedings.neurips.cc/paper_files/paper/2018/file/69386f6bb1dfed68692a24c8686939b9-Paper.pdf NIPS], [https://arxiv.org/pdf/1806.07366 Appendix]
 +
# Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations [https://doi.org/10.1016/j.jcp.2018.10.045 DOI], [https://github.com/maziarraissi/PINNs GitHub]
 +
# How much does it help to know what she knows you know? An agent-based simulation study [http://dx.doi.org/10.1016/j.artint.2013.05.004 DOI]
 +
# GRAND: Graph Neural Diffusion [https://proceedings.mlr.press/v139/chamberlain21a/chamberlain21a.pdf PMLR]
  
There will be a link to this video. I put a slow-speed version of this talk here.
+
===Can I select a paper of my own choice?===
Although I mentioned that this is the orientation hour, there will be homework.
+
Yes. Here are some formal requirements.  
First of all, please sign up for the course. Fill out the form; it will collect your email, and this email will be your ID. Please do not change it unless you want to lose your course scores. The deadline is late: September 20. You will have some time to think about your commitment.
 
The form asks for your name, how to address you, your university, and your student group.
 
  
To engage yourself, please think about your professional goal in connection with your future thesis.
+
# A clear message in the area of Machine Learning.
The required question: why do you want to take this course?
+
# No Kaggle-style papers with messages like "It works, but nobody knows how".  
What is the profit in it for you? 
+
# Top peer-reviewed journals only: no arXiv, and preferably no conference papers.
 +
# No papers from other fields: linguistics, medicine, finance, physics, etc.
 +
# No overviews of paper collections; that is another genre.
 +
# No [https://en.wikipedia.org/wiki/Predatory_publishing predatory publishing houses]
  
You answer this and this will be the first homework. 
+
In this case, please write an explanatory text about why you chose this paper.
Don't forget to press the submit button.
 
  
Also, to save time on future homework, please set up some tools.
+
=== Recommended journals ===
If you are not familiar with LaTeX, please find a crash course on this and start reading it.
+
# [https://link.springer.com/journal/10994 Machine Learning]
We will keep our progress in two archives. The first one is Google Forms and the second one is your GitHub repository.
+
# [https://www.sciencedirect.com/journal/expert-systems-with-applications Expert Systems with Applications]
 +
# [https://www.jmlr.org/ Journal of Machine Learning Research]
 +
# [https://www.sciencedirect.com/journal/artificial-intelligence Artificial Intelligence]
 +
# [https://www.sciencedirect.com/journal/neurocomputing/vol/609/suppl/C Neurocomputing]
 +
# [https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=34 IEEE Transactions on Pattern Analysis and Machine Intelligence]
 +
# [https://www.sciencedirect.com/journal/neural-networks/ Neural Networks]
 +
# [https://www.sciencedirect.com/journal/pattern-recognition/vol/158/suppl/C Pattern Recognition]
 +
# [https://link.springer.com/journal/10618 Data Mining and Knowledge Discovery]
 +
# [https://www.nature.com/natmachintell/research-articles Nature Machine Intelligence]; note that the first word here is Nature, so it focuses on the natural sciences
  
Also, if you're not familiar with Bishop's book, just start reading it.
+
====See also====
For fun and for homework watch a lecture by Patrick Winston from MIT on how to speak. We will talk about it in the next class. 
+
# [https://www.elsevier.com/open-access/open-archive Elsevier's open archive]
 +
# [https://www.springeropen.com/collections?subject=Computer+Science Springer's open archive]
  
So fill out your questionnaire and see you next week.
+
<!--
 +
==Fun==
 +
To do))
 +
-->
  
 
+
==Transcript of the video==
 
+
Appears after the seminar.
Hey, the test will be about Patrick Winston's lecture.
 

Latest revision as of 00:47, 22 September 2024

The most important things come first. We discuss the main message delivered by a scientific paper. We explore the first three elements of a scientific paper: the abstract, the highlights, and the keywords. The main message shall reveal itself through all elements of the paper, but we leave the rest for next time: the title, introduction, problem statement, goal of the computational experiment, and conclusion. We select a paper and practice reconstructing these three elements.

The seminar

  1. The warm-up 3-minute test
  2. Model, Algorithm, Method: Machine learning in a nutshell
  3. Step 1 homework, how to read: the scheme
  4. Structure of the abstract
  5. Extracting keywords
  6. Highlights: compressing the paper
  7. Infrastructure for your homework
    • GitHub: organize the repository
    • LaTeX: compile your file and commit without temporary files
  8. The papers to select from
  9. Step 0 homework results discussion
  10. Optional GPT-role discussion
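
For agenda item 7, one possible way to check that a repository holds no LaTeX temporary files before a commit is sketched below in Python. The suffix list and the function name find_temporary_files are assumptions made for illustration, not part of the course template; adjust the list to your own toolchain.

```python
from pathlib import Path

# Assumed list of suffixes that LaTeX tools commonly write next to the source.
# It is illustrative only and not taken from the course template.
TEMP_SUFFIXES = (
    ".aux", ".log", ".out", ".toc", ".blg",
    ".fls", ".fdb_latexmk", ".synctex.gz",
)


def find_temporary_files(repo_root):
    """Return sorted paths under repo_root whose names end with a temporary suffix."""
    root = Path(repo_root)
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.name.endswith(TEMP_SUFFIXES)
    )
```

Calling find_temporary_files(".") from the repository root lists the files to delete or to exclude before committing; the TEX source and the compiled PDF are left alone because their suffixes are not in the list.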

Resources

Step 1 YouTube video

Warning! The wrong microphone was used. This video will be re-recorded in a couple of days.

Homework

  1. Set up your GitHub repository using this template, see how
  2. Select a paper to read from the list below
  3. Reconstruct its
    1. Abstract
    2. Keywords
    3. Highlights
    4. Short motivation for why you selected this paper (no templates here, since it is an extra topic to discuss)
  4. Compile and upload TEX and PDF to GitHub (no temporary files, please)
  5. Fill out the Step 1 questionnaire
  6. Refresh your memory of linear models for the next warm-up test; either
    1. look for the terms dot product, scalar projection, least squares, linear map
    2. or, for fun, read pages 33-39 (Section L3) of the book.
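
As a refresher for the terms in homework item 6, here is a minimal pure-Python sketch of the dot product, the scalar projection, a linear map, and a least-squares line fit. The function names are hypothetical and chosen only for this illustration; the fit covers the simplest case, a straight line y = w0 + w1*x, not the general linear model.

```python
import math


def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(x * y for x, y in zip(a, b))


def scalar_projection(a, b):
    """Signed length of the projection of a onto the direction of b."""
    return dot(a, b) / math.sqrt(dot(b, b))


def apply_linear_map(matrix, vector):
    """Apply a linear map given as a list of rows: each output entry is a dot product."""
    return [dot(row, vector) for row in matrix]


def least_squares_line(xs, ys):
    """Fit y = w0 + w1 * x by minimizing the sum of squared residuals."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1
```

For example, least_squares_line([0, 1, 2], [1, 3, 5]) recovers the intercept 1 and the slope 2, since the three points lie exactly on y = 1 + 2x.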

Note that we always respect your credit hours. So please keep track of them.

Your gain here is the ability to find the main message of a paper.

How to read

There are many pieces of advice on how to read scientific papers, see an example. We suggest briefly looking through the paper's

  1. highlight, or pitch in the abstract,
  2. central formulas,
  3. clarifying figure,
  4. plots and tables,
  5. find the main idea.

Then answer these questions. What is:

  1. the topic?
  2. the subject of research?
  3. the main idea or message?
  4. the impact, is it useful for you?

Abstract

The abstract of a paper is the first piece the reader looks at. Usually, it is written twice: at the beginning of the research and again after the paper is done, before submission. Due to its importance, several versions of the abstract from different points of view are welcome.

The abstract is limited to 600 characters. It may contain

  1. wide-range field of the investigated problem,
  2. narrow problem to focus on,
  3. features and conditions of the problem,
  4. the idea of the suggested solution,
  5. the novelty and alternative solutions to compare with,
  6. application to illustrate with.
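
A small Python sketch for checking a draft against the 600-character limit is given below. The text above does not say whether spaces count, so this sketch assumes they do and that runs of whitespace (including line breaks from a TEX source) collapse to a single space; the function names are hypothetical.

```python
ABSTRACT_LIMIT = 600  # characters, the limit stated in this section


def abstract_length(text):
    """Count characters after collapsing whitespace runs (an assumed convention)."""
    return len(" ".join(text.split()))


def fits_limit(text, limit=ABSTRACT_LIMIT):
    """Return True if the normalized abstract is within the character limit."""
    return abstract_length(text) <= limit
```

Running fits_limit on each version of your abstract shows quickly which drafts still need compressing.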

Examples of abstracts to discuss, a draft.

Keywords

The keywords of your paper shall match the subject of your research, and show the area and the focus. Ensure these keywords are used in your paper frequently and play an important role. They shall be recognized terms in your field of knowledge. See detailed explanations and Elsevier recommendations.
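
To check that candidate keywords really are used frequently in the text, one can count their occurrences. The sketch below is a minimal illustration under simple assumptions: matching is case-insensitive, a multi-word keyword must appear as an exact phrase, and inflected forms are not matched. The function name keyword_counts is hypothetical.

```python
import re


def keyword_counts(text, keywords):
    """Count case-insensitive whole-phrase occurrences of each keyword in text."""
    # Collapse whitespace so that phrases broken across lines still match.
    normalized = " ".join(text.lower().split())
    counts = {}
    for keyword in keywords:
        pattern = r"\b" + re.escape(keyword.lower()) + r"\b"
        counts[keyword] = len(re.findall(pattern, normalized))
    return counts
```

A keyword with a count of zero or one is a signal to reconsider either the keyword or the wording of the paper.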

Highlights

To write highlights, see the official Elsevier version, a useful piece of advice, and Medium clarifications.

Papers to choose from

Please read a paper from this list and formulate its main message. Imagine you are a journal editor or a reviewer who receives scientific papers randomly and picks one.

If these papers are too difficult for you to understand, it is no big deal. Most likely, you were going to read a paper of your own interest. Read it. The main requirement is that it must be a scientific paper. See the next section.

You can briefly go through the bold items of How to Read an Engineering Research Paper by W.G. Griswold

IMPORTANT. Since the homework is to reconstruct the abstract of one of these papers, please try to skip the published abstract. Cover it and start reading according to the discussed reading scheme.

  1. Distinguishing time-delayed causal interactions using convergent cross mapping DOI
  2. Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria DOI, PDF
  3. Spatio-temporal filling of missing points in geophysical data sets DOI
  4. Analytic and stochastic methods of structure parameter estimation DOI
  5. Longitudinal predictive modeling of tau progression along the structural connectome DOI
  6. Generative or Discriminative? Getting the Best of Both Worlds PDF
  7. Neural Ordinary Differential Equations NIPS, Appendix
  8. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations DOI, GitHub
  9. How much does it help to know what she knows you know? An agent-based simulation study DOI
  10. GRAND: Graph Neural Diffusion PMLR

Can I select a paper of my own choice?

Yes. Here are some formal requirements.

  1. A clear message in the area of Machine Learning.
  2. No Kaggle-style papers with messages like "It works, but nobody knows how".
  3. Top peer-reviewed journals only: no arXiv, and preferably no conference papers.
  4. No papers from other fields: linguistics, medicine, finance, physics, etc.
  5. No overviews of paper collections; that is another genre.
  6. No predatory publishing houses

In this case, please write an explanatory text about why you chose this paper.

Recommended journals

  1. Machine Learning
  2. Expert Systems with Applications
  3. Journal of Machine Learning Research
  4. Artificial Intelligence
  5. Neurocomputing
  6. IEEE Transactions on Pattern Analysis and Machine Intelligence
  7. Neural Networks
  8. Pattern Recognition
  9. Data Mining and Knowledge Discovery
  10. Nature Machine Intelligence; note that the first word here is Nature, so it focuses on the natural sciences

See also

  1. Elsevier's open archive
  2. Springer's open archive


Transcript of the video

Appears after the seminar.