Vadim V. Strijov · ORC ID · Math-Net · GoogleScholar List of publications |
2022
Isachenko R.V., Strijov V.V. Quadratic programming feature selection for multicorrelated signal decoding with partial least squares // Expert Systems with Applications, 2022, 207 : 117967. Article |
Abstract: This paper investigates the dimensionality reduction problem for signal decoding. Its main application is brain-computer interface modeling. The challenge is high redundancy in the data description. The data combine time series of two origins: brain cortex signals in the design space and limb motion signals in the target space. High correlations among measurements of complex signals lead to multiple correlations. This paper studies correlations in input and target spaces that carry heterogeneous data. It proposes feature selection algorithms to construct a simple and stable forecasting model. It extends ideas of the quadratic programming feature selection (QPFS) approach and selects non-correlated features that are relevant to the target. The proposed methods take into account dependencies in both design and target space and select features which fit both spaces jointly. The computational experiment was carried out using an electrocorticogram (ECoG) dataset. The obtained models predict hand motions using signals of the brain cortex. The partial least squares (PLS) regression model is used as the base model for dimensionality reduction. The PLS algorithm obtains the best result when the space dimensionality is reduced using QPFS. |
BibTeX: @article{IsachenkoStrijov2022Decoding, author = {Isachenko, R. V. and Strijov, V. V.}, title = {Quadratic programming feature selection for multicorrelated signal decoding with partial least squares}, journal = {Expert Systems with Applications}, year = {2022}, volume = {207}, pages = {117967}, url = {/papers/isachenko2022qpfs_decoding.pdf}, doi = {10.1016/j.eswa.2022.117967} } |
Grabovoy A.V., Gadaev T.S., Motrenko A.P., Strijov V.V. Numerical methods of sufficient sample size estimation for generalised linear models // Lobachevskii Journal of Mathematics, 2022, 43 : 2453-2462. Article |
Abstract: This paper investigates the problem of cost reduction of data collection procedures. A sample set of minimum sufficient size must be collected to select an adequate regression or classification model. This sample set is modeled to follow the data generation hypotheses. Namely, the generalized linear regression models assume an independent and identically distributed target variable. The paper analyses several numerical methods of sample size estimation and compares them in practical terms. It includes statistical, heuristic, and Bayesian methods. The practical goal of a sample set collection is modeling. Some methods involve analysis of the model parameters. The computational experiment includes widely used sample sets. The open-source code and the software are provided for practitioners to use in data collection planning. |
BibTeX: @article{Grabovoy2021SampleSize, author = {Grabovoy, A. V. and Gadaev, T. S. and Motrenko, A. P. and Strijov, V. V.}, title = {Numerical methods of sufficient sample size estimation for generalised linear models}, journal = {Lobachevskii Journal of Mathematics}, year = {2022}, volume = {43}, pages = {2453-2462}, url = {http://links.springernature.com/f/a/W3ZeXWVkoEFhJMHWyIrhKQ /AABE5gA /RgRljdB7P0RcaHR0cHM6Ly90cmVidWNoZXQucHVibGljLnNwcmluZ2VybmF0dXJlLmFwcC9nZXRfY29udGVudC85MDUyYjgyNS05MmFkLTQyMmItOWVjNy1jNzhmMmY1OGI3OGNXA3NwY0IKY6J7S6tjKeLdUlIUc3RyaWpvdkBwaHlzdGVjaC5lZHVYBAAABy0 }, doi = {10.1134/S1995080222120125} } |
Grabovoy A.V., Strijov V.V. Probabilistic Interpretation of the Distillation Problem // Automation and Remote Control, 2022, 83(1) : 123-137. Article |
Abstract: The article deals with methods for reducing the complexity of approximating models. A probabilistic substantiation of distillation and privileged teaching methods is proposed. General conclusions are given for an arbitrary parametric function with a predetermined structure. A theoretical basis is demonstrated for the special cases of linear and logistic regression. The analysis of the considered models is carried out in a computational experiment on synthetic samples and real data. The FashionMNIST and Twitter Sentiment Analysis samples serve as the real data. |
BibTeX: @article{Grabovoy2021Distilling, author = {Grabovoy, A. V. and Strijov, V. V.}, title = {Probabilistic Interpretation of the Distillation Problem}, journal = {Automation and Remote Control}, year = {2022}, volume = {83}, number = {1}, pages = {123--137}, url = {https://trebuchet.public.springernature.app/get_content/4df58851-23f3-4ec8-95f8-30eff603197f}, doi = {10.1134/S000511792201009X} } |
Bazarova A.I., Grabovoy A.V., Strijov V.V. Analysis of the properties of probabilistic models in expert-augmented learning problems // Automation and Remote Control, 2022, 83 : 1527-1537. Article |
Abstract: The paper deals with the construction of interpretable machine learning models. The approximation problem is solved for a set of shapes on a contour image. Assumptions that the shapes are second-order curves are introduced. When approximating the shapes, information about the type, location, and shape of curves as well as about the set of their possible transformations is used. Such information is called expert information, and the machine learning method based on expert information is called expert-augmented learning. It is assumed that the set of shapes is approximated by the set of local models. Each local model based on expert information approximates one shape on the contour image. To construct the models, it is proposed to map second-order curves into a feature space in which each local model is linear. Thus, second-order curves are approximated by a set of linear models. In a computational experiment, the problem of approximating an iris on a contour image is considered. |
BibTeX: @article{B2021BayesianDistilationRu, author = {Bazarova, A. I. and Grabovoy, A. V. and Strijov, V. V.}, title = {Analysis of the properties of probabilistic models in expert-augmented learning problems}, journal = {Automation and Remote Control}, year = {2022}, volume = {83}, pages = {1527-1537}, url = {https://link.springer.com/epdf/10.1134/S00051179220100058?sharing_token=hAPcnuIqzQzbt4k9e1mK60ckSORA_DxfnEvY7GoQybYVd6LPNBk87BsZksMeOmQTQkPHqNC0C0hhH4wgkIwUBXiYnzpFiL-xlzke_QsjGa9T079qlMNETVn8oSyj0Oa8YO234_op_q_nnSelEgSihsbSeTNLMy5eQfqTNKRAu-E=}, doi = {10.1134/S00051179220100058} } |
Gorpinich M., Bakhteev O.Y., Strijov V.V. Gradient Methods for Optimizing Metaparameters in the Knowledge Distillation Problem // Automation and Remote Control, 2022, 83(10) : 1544-1554. Article |
Abstract: The paper investigates the distillation problem for deep learning models. Knowledge distillation is a metaparameter optimization problem in which information from a model of a more complex structure, called a teacher model, is transferred to a model of a simpler structure, called a student model. The paper proposes a generalization of the distillation problem for the case of optimization of metaparameters by gradient methods. Metaparameters are the parameters of the distillation optimization problem. The loss function for such a problem is the sum of the classification term and the cross-entropy between the responses of the student model and the teacher model. Assigning optimal metaparameters to the distillation loss function is a computationally difficult task. The properties of the optimization problem are investigated to predict the metaparameter update trajectory. An analysis of the trajectory of the gradient optimization of metaparameters is carried out, and their value is predicted using linear functions. The proposed approach is illustrated using a computational experiment on CIFAR-10 and Fashion-MNIST samples and synthetic data. |
BibTeX: @article{Gorpinich_2022, author = {M. Gorpinich and O. Yu. Bakhteev and V. V. Strijov}, title = {Gradient Methods for Optimizing Metaparameters in the Knowledge Distillation Problem}, journal = {Automation and Remote Control}, year = {2022}, volume = {83}, number = {10}, pages = {1544--1554}, url = {https://trebuchet.public.springernature.app/get_content/8c4414a5-9e0f-461f-b2f5-d406954a9017}, doi = {10.1134/s00051179220100071} } |
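For the entry above (Gorpinich, Bakhteev, Strijov, 2022), the abstract describes the distillation loss as the sum of a classification term and the cross-entropy between student and teacher responses. The sketch below illustrates that structure only; it is not the authors' implementation. The mixing weight `beta` and the temperature `T` are assumed here as typical examples of metaparameters, and all function names are hypothetical.

```python
import math

def softmax(logits, T=1.0):
    """Temperature-scaled softmax; T is an assumed metaparameter."""
    scaled = [z / T for z in logits]
    m = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(p, q, eps=1e-12):
    """Cross-entropy H(p, q) = -sum_k p_k * log(q_k)."""
    return -sum(pk * math.log(qk + eps) for pk, qk in zip(p, q))

def distillation_loss(student_logits, teacher_logits, label, beta=0.5, T=2.0):
    """Weighted sum of a classification term and a teacher-student term.

    beta and T are illustrative metaparameters, assumed for this sketch.
    """
    n = len(student_logits)
    one_hot = [1.0 if k == label else 0.0 for k in range(n)]
    # Classification term: cross-entropy with the true label.
    classification = cross_entropy(one_hot, softmax(student_logits))
    # Transfer term: cross-entropy between teacher and student responses.
    transfer = cross_entropy(softmax(teacher_logits, T),
                             softmax(student_logits, T))
    return (1.0 - beta) * classification + beta * transfer

student = [2.0, 0.5, -1.0]
teacher = [3.0, 0.2, -2.0]
print(distillation_loss(student, teacher, label=0))
```

In a gradient-based setting such as the one the abstract describes, `beta` and `T` would be the quantities updated alongside the student parameters; here they are fixed constants for clarity.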
Motrenko A., Simchuk E., Khairullin R., Inyakin A., Kashirin D., Strijov V.V. Continuous physical activity recognition for intelligent labour monitoring // Multimedia Tools and Applications, 2022, 81(4) : 4877-4895. Article |
Abstract: The paper addresses the problem of human activity recognition based on data from wearable sensors. Human activity recognition depends on a wide context of actions. Activities cannot be recognized from the local shape of sensor signals only. We propose a solution to the problem of human activity recognition applied to labour monitoring. The solution is based on the hierarchical representation of activities as sets of low-level actions. Viewing activities as sequences of actions allows exploring activities in a more condensed representation than time series. The hierarchical representation provides an interpretable description of studied activities in terms of actions. To obtain this hierarchical representation, one must first solve the problem of low-level action recognition. Though widely studied, the problem of action recognition requires overcoming several difficulties. Firstly, we show that using noise-aware self-learning methods can significantly improve classification quality in human activity recognition. Since time series are human-labeled, errors are inevitable and abundant. Noisy labels significantly worsen classification quality. Noise-aware learning relaxes the requirements for labeling precision and lowers annotation costs. Secondly, we propose an algorithm of automatic pattern selection to generate low-level descriptions in an unsupervised manner as an alternative. The proposed method is based on Eamonn Keogh's time series indexing methods. We introduce local PCA projections to make the method more robust to spatial rotations of a wearable device. |
BibTeX: @article{motrenko2020continous, author = {Motrenko, Anastasia and Simchuk, Egor and Khairullin, Renat and Inyakin, Andrey and Kashirin, Daniil and Strijov, Vadim Victor}, title = {Continuous physical activity recognition for intelligent labour monitoring}, journal = {Multimedia Tools and Applications}, year = {2022}, volume = {81}, number = {4}, pages = {4877--4895}, url = {https://doi.org/10.1007/s11042-021-11288-y}, doi = {10.1007/s11042-021-11288-y} } |
Neychev R.G., Shibaev I.A., Strijov V.V. Optimal spanning tree reconstruction in symbolic regression // Informatics and Applications, 2022. Article |
Abstract: This paper investigates the problem of regression model generation. A model is a superposition of primitive functions. The model structure is described by a weighted colored graph. Each graph vertex corresponds to some primitive function. An edge assigns a superposition of two functions. The weight of an edge equals the probability of superposition. To generate an optimal model one has to reconstruct its structure from its graph adjacency matrix. The proposed algorithm reconstructs the minimum spanning tree from the weighted colored graph. This paper presents a novel solution based on the prize-collecting Steiner tree algorithm. This algorithm is compared with its alternatives. |
BibTeX: @article{Shibaev2021Graph, author = {Radoslav G. Neychev and Innokentiy A. Shibaev and Vadim V. Strijov}, title = {Optimal spanning tree reconstruction in symbolic regression}, journal = {Informatics and Applications}, year = {2022}, url = {/papers/Shibaev2022Symbolic.pdf} } |
Yakovlev K., Grebenkova O., Bakhteev O., Strijov V. Neural Architecture Search with Structure Complexity Control // EasyChair, 2022. Article |
Abstract: The paper investigates the problem of deep learning model selection. The authors propose a method of neural architecture search with respect to its desired complexity. As the complexity, we consider the number of parameters that the selected architecture uses. The method is based on the differentiable architecture search algorithm (DARTS). Instead of optimizing structural parameters of the architecture, we consider them as a function depending on the complexity parameter. To evaluate the quality of the proposed algorithm, we conduct experiments on the Fashion-MNIST and CIFAR-10 datasets and compare the resulting architecture with the DARTS method. |
BibTeX: @article{Yakovlev2022EasyChair:7973, author = {Konstantin Yakovlev and Olga Grebenkova and Oleg Bakhteev and Vadim Strijov}, title = {Neural Architecture Search with Structure Complexity Control}, journal = {EasyChair}, year = {2022}, url = {https://yahootechpulse.easychair.org/publications/preprint_download/H5MC}, doi = {https://easychair.org/publications/preprint/H5MC} } |
Samokhina A.M., Goncharenko V.V., Grigoryan R.K., Strijov V.V. Classification models for P300 evoked potentials // Systems and Means of Informatics, 2022, 32(3) : 36-49. Article Rus |
Abstract: The paper is devoted to the problem of user attention detection. It investigates how to determine the chosen visual stimulus from an electroencephalogram (EEG) by detecting the P300 event-related potential in it. The electrical brain potentials are measured while the user is observing visual stimuli. The goal is to select a stimulus that causes the maximum brain response. A classification model detects if there is a P300 potential in an EEG segment. Various classification models for event-related potentials are compared. The paper proposes a data augmentation method to improve classification quality. Computational experiments use an original real-world dataset of P300 potentials. This dataset was collected from 60 healthy users who were presented with visual stimuli. It is released to public access. |
BibTeX: @article{Samokhina2022P300, author = {Samokhina, A. M. and Goncharenko, V. V. and Grigoryan, R. K. and Strijov, V. V.}, title = {Classification models for P300 evoked potentials}, journal = {Systems and Means of Informatics}, year = {2022}, volume = {32}, number = {3}, pages = {36-49}, url = {/papers/Samokhina2022P300.pdf}, doi = {10.14357/08696527220304} } |
Grabovoy A.V. Expert learning and Bayesian multi-modelling. Moscow Institute of Physics and Technology, 2022. PhdThesis Rus |
Abstract: The paper investigates a mixture of expert models. The mixture of experts is a combination of experts, which are local approximation models, and a gate function, which weighs these experts and forms their ensemble. In this work, each expert is a linear model. The gate function is a neural network with softmax on the last layer. The paper analyzes various prior distributions for each expert. The authors propose a method that takes into account the relationship between prior distributions of different experts. The EM algorithm optimises both parameters of the local models and parameters of the gate function. As an application problem, the paper solves a problem of shape recognition on images. Each expert fits one circle in an image and recovers its parameters: the coordinates of the center and the radius. The computational experiment uses synthetic and real data to test the proposed method. The real data is a human eye image from the iris detection problem. |
BibTeX: @phdthesis{Grabovoy2022PhDThesis, author = {Grabovoy, A. V.}, title = {Expert learning and Bayesian multi-modelling}, school = {Moscow Institute of Physics and Technology}, year = {2022}, url = {https://www.youtube.com/watch?v=h0K4sKhS9-w}, doi = {https://github.com/andriygav/PhDThesis} } |
2021
Grabovoy A.V., Strijov V.V. Bayesian Distillation of Deep Learning Models // Automation and Remote Control, 2021, 82 : 1846-1856. Article |
Abstract: We study the problem of reducing the complexity of approximating models and consider methods based on distillation of deep learning models. The concepts of trainer and student are introduced. It is assumed that the student model has fewer parameters than the trainer model. A Bayesian approach to the student model selection is suggested. A method is proposed for assigning an a priori distribution of student parameters based on the a posteriori distribution of trainer model parameters. Since the trainer and student parameter spaces do not coincide, we propose a mechanism for the reduction of the trainer model parameter space to the student model parameter space by changing the trainer model structure. A theoretical analysis of the proposed reduction mechanism is carried out. A computational experiment was carried out on synthesized and real data. The FashionMNIST sample was used as real data. |
BibTeX: @article{Grabovoy2021Distilling, author = {Grabovoy, A. V. and Strijov, V. V.}, title = {Bayesian Distillation of Deep Learning Models}, journal = {Automation and Remote Control}, year = {2021}, volume = {82}, pages = {1846-1856}, url = {/papers/Grabovoy2021BayesianDistilation.pdf}, doi = {10.1134/S0005117921110023} } |
Grabovoy A.V., Strijov V.V. Prior distribution selection for a mixture of experts // Computational Mathematics and Mathematical Physics, 2021, 61(7) : 1149-1161. Article |
Abstract: The paper investigates a mixture of expert models. The mixture of experts is a combination of experts, which are local approximation models, and a gate function, which weighs these experts and forms their ensemble. In this work, each expert is a linear model. The gate function is a neural network with softmax on the last layer. The paper analyzes various prior distributions for each expert. The authors propose a method that takes into account the relationship between prior distributions of different experts. The EM algorithm optimises both parameters of the local models and parameters of the gate function. As an application problem, the paper solves a problem of shape recognition on images. Each expert fits one circle in an image and recovers its parameters: the coordinates of the center and the radius. The computational experiment uses synthetic and real data to test the proposed method. The real data is a human eye image from the iris detection problem. |
BibTeX: @article{GrabovoyStrijov2020ExpertLearning, author = {Grabovoy, A. V. and Strijov, V. V.}, title = {Prior distribution selection for a mixture of experts}, journal = {Computational Mathematics and Mathematical Physics}, year = {2021}, volume = {61}, number = {7}, pages = {1149-1161}, url = {/papers/GrabovoyStrijov2020ExpertLearning.pdf}, doi = {10.1134/S0965542521070071} } |
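For the entry above (Grabovoy and Strijov, 2021), the abstract describes a mixture of linear experts weighted by a gate function with a softmax output. The sketch below shows only the forward pass of such a mixture. As a simplifying assumption, the gate is a single linear layer followed by softmax rather than a full neural network, and the prior distributions and the EM optimisation from the paper are not modeled. All names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    """Inner product of two equal-length vectors."""
    return sum(wi * xi for wi, xi in zip(w, x))

def mixture_predict(x, expert_weights, gate_weights):
    """Return the gated prediction and the gate values for input x.

    expert_weights[k] is the weight vector of the k-th linear expert;
    gate_weights[k] is the gate's score vector for that expert
    (a one-layer gate, assumed here for brevity).
    """
    gate = softmax([dot(v, x) for v in gate_weights])
    outputs = [dot(w, x) for w in expert_weights]
    prediction = sum(g * y for g, y in zip(gate, outputs))
    return prediction, gate

x = [1.0, 2.0]
experts = [[1.0, 0.0], [0.0, 1.0]]   # two linear experts
gate = [[0.0, 0.0], [0.0, 0.0]]      # uninformative gate: equal weights
print(mixture_predict(x, experts, gate))
```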
Kuzmin A.A., Aduenko A.A., Strijov V.V. Hierarchical thematic classification of major conference proceedings // CICLing, 2021. Article |
Abstract: In this paper we develop a decision support system for hierarchical text classification. We consider text collections with a fixed hierarchical structure of topics given by experts in the form of a tree. The system sorts the topics by relevance to a given document. The experts choose one of the most relevant topics to finish the classification. We propose a weighted hierarchical similarity function to calculate topic relevance. The function calculates the similarity of a document and a tree branch. The weights in this function determine word importance. We use the entropy of words to estimate the weights. The proposed hierarchical similarity function formulates a joint hierarchical thematic classification probability model of the document topics, parameters, and hyperparameters. Variational Bayesian inference gives a closed-form EM algorithm. The EM algorithm estimates the parameters and calculates the probability of a topic for a given document. Compared to hierarchical multiclass SVM, hierarchical PLSA with adaptive regularization, and hierarchical naive Bayes, the weighted hierarchical similarity function achieves better ranking accuracy on a collection of abstracts of the major conference EURO and on a collection of web sites of industrial companies. |
BibTeX: @article{Kuzmin2018Similarity, author = {Kuzmin, A. A. and Aduenko, A. A. and Strijov, V. V.}, title = {Hierarchical thematic classification of major conference proceedings}, journal = {CICLing}, year = {2021}, url = {/papers/Kuzmin2017HierarchicalThematic.pdf} } |
Grebenkova O.S., Bakhteev O.Y., Strijov V.V. Variational deep learning model optimization with complexity control // Informatics and Applications, 2021, 15(1) : 42-49. Article Rus |
Abstract: This paper investigates the problem of deep learning model optimization. We propose a method to control the model complexity. The minimum description length is interpreted as the complexity of the model. It acts as the minimal amount of information that is required to transfer information about the model and the dataset. The proposed method is based on the representation of a deep learning model. We propose the form of a hypernet using Bayesian inference. A hypernet is a model that generates parameters of an optimal model. We introduce probabilistic assumptions about the distribution of parameters of the deep learning model. The paper suggests maximizing the evidence lower bound of the Bayesian model validity. We consider the evidence bound as a conditional value that depends on the required model complexity. We analyze this method in computational experiments on the MNIST dataset. |
BibTeX: @article{Grebenkova2020HyperNet, author = {Grebenkova, O. S. and Bakhteev, O. Yu. and Strijov, V. V.}, title = {Variational deep learning model optimization with complexity control}, journal = {Informatics and Applications}, year = {2021}, volume = {15}, number = {1}, pages = {42-49}, url = {/papers/Grebenkova2020HyperNet.pdf}, doi = {10.14357/19922264210106} } |
Yaushev F.Y., Isachenko R.V., Strijov V.V. Concordant models for latent space projections in forecasting // Systems and Means of Informatics, 2021, 31(1) : 4-16. Article Rus |
Abstract: The paper examines the problem of predicting a complex structured target variable. Complexity refers to the presence of dependencies, whether linear or non-linear. The source data is assumed to be heterogeneous. This means that the spaces of the independent and target variables are of different nature. It is proposed to build a predictive model that takes into account the dependence in the input space of the independent variable, as well as in the space of the target variable. It is proposed to carry out the model agreement procedure in a low-dimensional latent space. The projection to latent space method is used as the basic algorithm. The paper compares the linear and proposed nonlinear models. The comparison is performed on heterogeneous data in high-dimensional spaces. |
BibTeX: @article{Isachenko2020CanonicCorrelation, author = {Yaushev, F. Yu. and Isachenko, R. V. and Strijov, V. V.}, title = {Concordant models for latent space projections in forecasting}, journal = {Systems and Means of Informatics}, year = {2021}, volume = {31}, number = {1}, pages = {4-16}, url = {/papers/Isachenko2020CanonicCorrelation.pdf}, doi = {10.14357/08696527210101} } |
Vorontsov K., Iglovikov V., Strijov V., Ustuzhanin A., Khritankov A. Challenges in repeatable experiments and reproducible research in data science // Proceedings of MIPT. MIPT, 2021, 13(2) : 100-108. Article Rus |
Abstract: This article provides a summary of the results of a roundtable discussion on experimental design and research reproducibility in data science. A distinction is made between scientific and applied research. The issues of determining quality for both types of research are considered, along with what constitutes reproducibility of the results in each case. In addition, an attempt is made to determine the directions for further development of infrastructure and methodology for the development of predictive models, algorithms and experiments. The recommendations formulated can be useful for the development of machine learning course curricula. |
BibTeX: @article{Khritankov2021Experiment, author = {Vorontsov, Konstantin and Iglovikov, Vladimir and Strijov, Vadim and Ustuzhanin, Andrey and Khritankov, Anton}, title = {Challenges in repeatable experiments and reproducible research in data science}, journal = {Proceedings of MIPT}, publisher = {MIPT}, year = {2021}, volume = {13}, number = {2}, pages = {100-108}, url = {https://mipt.ru/upload/medialibrary/a96/09.pdf}, doi = {10.31857/S0005231021100019} } |
Isachenko R.V. Dimensionality reduction for signal decoding (PhD thesis supervised by V.V. Strijov). Moscow Institute of Physics and Technology, 2021. PhdThesis Rus Submitted for defence in 2021. |
Abstract: The thesis investigates space dimensionality reduction for the problem of signal decoding. The decoding process consists in restoring the relationship between two heterogeneous data sets. The predictive model predicts a set of target signals from a set of source signals. |
BibTeX: @phdthesis{Isachenko2021PhDThesis, author = {Isachenko, R. V.}, title = {Dimensionality reduction for signal decoding (PhD thesis supervised by V.V. Strijov)}, school = {Moscow Institute of Physics and Technology}, year = {2021}, url = {https://github.com/r-isachenko/PhDThesis/raw/master/doc/Isachenko2021PhDThesis.pdf}, doi = {https://github.com/r-isachenko/PhDThesis} } |
2020
Bakhteev O.Y., Strijov V.V. Comprehensive analysis of gradient-based hyperparameter optimization algorithms // Annals of Operations Research, 2020 : 1-15. Article |
Abstract: The paper investigates the hyperparameter optimization problem. Hyperparameters are the parameters of the model parameter distribution. An adequate choice of hyperparameter values prevents model overfitting and allows the model to obtain higher predictive performance. Neural network models with a large number of hyperparameters are analyzed. The hyperparameter optimization for such models is computationally expensive. The paper proposes modifications of various gradient-based methods to simultaneously optimize many hyperparameters. The paper compares the experiment results with the random search. The main contribution of the paper is the analysis of hyperparameter optimization algorithms for models with a large number of parameters. To select precise and stable models the authors suggest using two model selection criteria: cross-validation and evidence lower bound. The experiments show that the models optimized using the evidence lower bound give a higher error rate than the models obtained using cross-validation. These models also show greater stability when data is noisy. The evidence lower bound usage is preferable when the model tends to overfit or when the cross-validation is computationally expensive. The algorithms are evaluated on regression and classification datasets. |
BibTeX: @article{Bakhteev2017HypergradEn4, author = {Bakhteev, O. Y. and Strijov, V. V.}, title = {Comprehensive analysis of gradient-based hyperparameter optimization algorithms}, journal = {Annals of Operations Research}, year = {2020}, pages = {1-15}, url = {/papers/Bakhteev2017Hypergrad.pdf}, doi = {10.1007/s10479-019-03286-z} } |
Grabovoy A.V., Strijov V.V. Quasi-periodic time series clustering for human activity recognition // Lobachevskii Journal of Mathematics, 2020, 41 : 333-339. Article |
Abstract: This paper analyses the periodic signals in the time series to recognize human activity by using a mobile accelerometer. Each point in the timeline corresponds to a segment of historical time series. These segments form a phase trajectory in the phase space of human activity. The principal components of segments of the phase trajectory are treated as feature descriptions at the point in the timeline. The paper introduces a new distance function between the points in the new feature space. The paper proposes an algorithm to reveal changes in the type of human activity. This algorithm clusters points of the timeline by using a pairwise distance matrix. The algorithm was tested on synthetic and real data. The real data were obtained from a mobile accelerometer. |
BibTeX: @article{Grabovoy2019QuasiPeriodicTimeSeries, author = {Grabovoy, A. V. and Strijov, V. V.}, title = {Quasi-periodic time series clustering for human activity recognition}, journal = {Lobachevskii Journal of Mathematics}, year = {2020}, volume = {41}, pages = {333-339}, url = {/papers/Grabovoy2019QuasiPeriodicTimeSeries.pdf}, doi = {10.1134/S1995080220030075} } |
Nikitin F., Isayev O., Strijov V. DRACON: disconnected graph neural network for atom mapping in chemical reactions // Physical Chemistry Chemical Physics, 2020, 22 : 26478-26486. Article |
Abstract: Machine learning solved many challenging problems in computer-assisted synthesis prediction (CASP). We formulate a reaction prediction problem in terms of node-classification in a disconnected graph of source molecules and generalize a graph convolution neural network for disconnected graphs. Here we demonstrate that our approach can successfully predict reaction outcome and atom-mapping during a chemical transformation. A set of experiments using the USPTO dataset demonstrates excellent performance and interpretability of the proposed model. Implicitly learned latent vector representation of chemical reactions strongly correlates with the class of the chemical reaction. Reactions with similar templates group together in the latent vector space. |
BibTeX: @article{NikitinIsaevStrijov2020Dracon, author = {Filipp Nikitin and Olexandr Isayev and Vadim Strijov}, title = {DRACON: disconnected graph neural network for atom mapping in chemical reactions}, journal = {Physical Chemistry Chemical Physics}, year = {2020}, volume = {22}, pages = {26478-26486}, url = {https://chemrxiv.org/engage/chemrxiv/article-details/60c74e0f9abda2cf1af8d58a}, doi = {10.1039/D0CP04748A} } |
Usmanova K.R., Zhuravlev Y.I., Rudakov K.V., Strijov V.V. Approximation of quasiperiodic signal phase trajectory using directional regression // Computational Mathematics and Cybernetics, 2020, 44 : 196-202. Article |
Abstract: This paper solves the phase trajectory approximation problem. A quasiperiodic time series forms a trajectory in a high-dimensional space. The trajectory is represented in the spherical coordinate system. To approximate the trajectory the authors use a directional regression technique. It finds the space of minimal dimension in which the phase trajectory has no self-intersections. Self-intersections are defined within the standard deviation of the reconstructed trajectory. The experiment was conducted on two data sets: electricity consumption data collected over a year and accelerometer sensor data recorded while walking and running. |
BibTeX: @article{Usmanova2020Directional, author = {Usmanova, K. R. and Zhuravlev, Yu. I. and Rudakov, K. V. and Strijov, V. V.}, title = {Approximation of quasiperiodic signal phase trajectory using directional regression}, journal = {Computational Mathematics and Cybernetics}, year = {2020}, volume = {44}, pages = {196--202}, url = {/papers/Usmanova2020Directional.pdf}, doi = {10.3103/S0278641920040068} } |
Goncharov A.V., Strijov V.V. Alignment of ordered set cartesian product // Informatics and Applications, 2020, 14(1) : 31-39. Article Rus |
Abstract: The work is devoted to the study of metric methods for analyzing objects with complex structure. It proposes to generalize the dynamic time warping (DTW) method for two time series to the case of objects defined on two or more time axes. Such objects are matrices in the discrete representation. The DTW method for time series is generalized as a method of dynamic alignment of matrices. The paper proposes a distance function resistant to monotonic nonlinear deformations of the Cartesian product of two time scales. The alignment path between objects is defined. An object is a matrix in which the rows and columns correspond to the time axes. The properties of the proposed distance function are investigated. To illustrate the method, the problems of metric classification of objects are solved on model data and data from the MNIST dataset. |
BibTeX: @article{Goncharov2019mDTW, author = {Goncharov, A. V. and Strijov, V. V.}, title = {Alignment of ordered set cartesian product}, journal = {Informatics and Applications}, year = {2020}, volume = {14(1)}, pages = {31-39}, url = {/papers/Goncharov2019mDTW.pdf}, doi = {10.14357/19922264200105} } |
Grabovoy A.V., Bakhteev O.Y., Strijov V.V. Ordering the set of neural network parameters // Informatics and Applications, 2020, 14(2) : 58-65. Article Rus |
Abstract: This paper investigates a method for ordering the set of model parameters. It considers linear models and neural networks. The set is ordered using the covariance matrix of the gradients. The paper proposes to use this order to freeze model parameters during the optimization procedure. It is assumed that after a few iterations of the optimization algorithm, most of the model parameters can be frozen without significant loss of model quality. This reduces the dimensionality of the optimization problem. The method is analyzed in a computational experiment on real data. The proposed order is compared with a random order on the set of model parameters. |
BibTeX: @article{Grabovoy2019FindGoodParameters, author = {Grabovoy, A. V. and Bakhteev, O. Yu. and Strijov, V. V.}, title = {Ordering the set of neural network parameters}, journal = {Informatics and Applications}, year = {2020}, volume = {14(2)}, pages = {58-65}, url = {/papers/Grabovoy2019FindGoodParameters.pdf}, doi = {10.14357/19922264200208} } |
Potanin M.S., Vayser K.O., Zholobov V.A., Strijov V.V. Deep learning model structure optimization // Informatics and Applications, 2020, 14(4) : 55-62. Article Rus |
Abstract: The paper investigates the optimal model structure selection problem. The model is a superposition of generalized linear models. Its elements are linear regression, logistic regression, principal component analysis, autoencoder and neural network. Model structure refers to the values of structural parameters that determine the form of the final superposition. This paper analyzes a model structure selection method and investigates how the accuracy, complexity and stability of the model depend on it. The paper proposes an algorithm for selecting the optimal structure of a neural network. The proposed method was tested on real and synthetic data. The experiment shows a significant reduction in the structural complexity of the model while maintaining the approximation accuracy. |
BibTeX: @article{Potanin2020DLGenetic, author = {Potanin, M. S. and Vayser, K. O. and Zholobov, V. A. and Strijov, V. V.}, title = {Deep learning model structure optimization}, journal = {Informatics and Applications}, year = {2020}, volume = {14(4)}, pages = {55-62}, url = {/papers/Potanin2020DLGenetic.pdf}, doi = {10.14357/19922264200408} } |
Bakhteev O.Y. Suboptimal complexity deep learning model selection (PhD thesis supervised by V.V. Strijov). Moscow Institute of Physics and Technology, 2020. PhdThesis Rus |
Abstract: Hyperparameters are the parameters of the model parameter distribution. An adequate choice of hyperparameter values prevents model overfitting and yields higher predictive performance. Neural network models with a large number of hyperparameters are analyzed. Hyperparameter optimization for such models is computationally expensive. The paper proposes modifications of various gradient-based methods to optimize many hyperparameters simultaneously. The paper compares the experimental results with random search. The main contribution of the paper is an analysis of hyperparameter optimization algorithms for models with a large number of parameters. To select precise and stable models, the authors suggest using two model selection criteria: cross-validation and the evidence lower bound. The algorithms are evaluated on regression and classification datasets. |
BibTeX: @phdthesis{Bakhteev2020ModelSelectionPhD, author = {Bakhteev, O. Yu.}, title = {Suboptimal complexity deep learning model selection (PhD thesis supervised by V.V. Strijov)}, school = {Moscow Institute of Physics and Technology}, year = {2020}, url = {http://www.frccsc.ru/sites/default/files/docs/ds/002-073-05/diss/26-bahteev/ds05-26-bahteev_main.pdf?28}, doi = {https://github.com/bahleg/tex_phd/raw/master/doc/BakhteevThesis.pdf} } |
2019Anikeev D.A., Penkin G.O., Strijov V.V. Local approximation models for human physical activity classification // Informatics and Applications, 2019, 13(1) : 40-48. Article Rus |
Abstract: The paper investigates the classification of time series from a mobile phone accelerometer. A physical activity class corresponds to a time series segment. Each segment is associated with its feature description, which is generated by an approximating spline. The elements of the feature vector are the coefficients of the basis spline functions. The computational experiment finds the optimal approximation parameters and the parameters of the classification model by maximizing the likelihood of the logistic classification model. |
BibTeX: @article{AnikeyevPenkin2017Splines, author = {Anikeev, D. A. and Penkin, G. O. and Strijov, V. V.}, title = {Local approximation models for human physical activity classification}, journal = {Informatics and Applications}, year = {2019}, volume = {13(1)}, pages = {40-48}, url = {/papers/AnikeyevPenkin2017Splines.pdf}, doi = {10.14357/19922264190106} } |
Grabovoy A.V., Bakhteev O.Y., Strijov V.V. Estimation of relevance for neural network parameters // Informatics and Applications, 2019, 13(2) : 62-70. Article Rus |
Abstract: This paper investigates a method for optimizing the structure of a neural network. It assumes that the number of neural network parameters can be reduced without significant loss of quality and without a significant increase in the variance of the loss function. The paper proposes a method for automatic estimation of the relevance of parameters to prune a neural network. This method analyzes the covariance matrix of the posterior distribution of the model parameters and removes the least relevant and multicorrelated parameters. It uses the Belsley method to search for multicorrelation in the neural network. The proposed method was tested on the Boston Housing data set, the Wine data set, and synthetic data. |
BibTeX: @article{Grabovoy2018OptimalBrainDamage, author = {Grabovoy, A. V. and Bakhteev, O. Yu. and Strijov, V. V.}, title = {Estimation of relevance for neural network parameters}, journal = {Informatics and Applications}, year = {2019}, volume = {13(2)}, pages = {62-70}, url = {/papers/Grabovoy2018OptimalBrainDamage.pdf}, doi = {10.14357/19922264190209} } |
Usmanova K.R., Strijov V.V. Time series dependencies detection to construct forecasting models // Systems and Means of Informatics, 2019, 29(2) : 12-30. Article Rus |
Abstract: The forecasting problem requires establishing relationships between multiple time series. Including related time series in a forecasting model improves the forecast quality. This paper introduces the convergent cross mapping (CCM) method to establish a relationship between time series. This method estimates the accuracy of reconstructing one time series from another. The CCM detects relationships between series not only in full trajectory spaces but also in trajectory subspaces. The computational experiment is carried out on two sets of time series: electricity consumption and air temperature, and oil transportation volume and oil production volume. |
BibTeX: @article{Usmanova2018CCM, author = {Usmanova, K. R. and Strijov, V. V.}, title = {Time series dependencies detection to construct forecasting models}, journal = {Systems and Means of Informatics}, year = {2019}, volume = {29(2)}, pages = {12-30}, url = {/papers/Usmanova2018CCM.pdf}, doi = {10.14357/08696527190202} } |
Motrenko A.P. Model selection for multicorrelated time series (PhD thesis supervised by V.V. Strijov). Moscow Institute of Physics and Technology, 2019. PhdThesis Rus |
Abstract: We solve the problem of feature selection in regression models in application to ECoG-based motion decoding. The task is to predict hand trajectories from the voltage time series of cortical activity. The feature description of each point resides in the spatial-temporal-frequency domain and includes the voltage time series themselves and their spectral characteristics. Feature selection is crucial for an adequate solution of this regression problem, since electrocorticographic data is high-dimensional and the measurements are correlated in both the time and space domains. We propose a multi-way formulation of quadratic programming feature selection (QPFS), a recent approach to filtering-based feature selection proposed by Katrutsa and Strijov, "Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria". QPFS incorporates both estimates of similarity between features and their relevance to the regression problem, and provides an effective way to leverage them by solving a quadratic program. Our modification allows this approach to be applied to multi-way data. We show that this modification improves the prediction quality of the resulting models. |
BibTeX: @phdthesis{Motrenko2019ModelSelectionPhD, author = {Motrenko, A. P.}, title = {Model selection for multicorrelated time series (PhD thesis supervised by V.V. Strijov)}, school = {Moscow Institute of Physics and Technology}, year = {2019}, url = {https://sourceforge.net/p/mlalgorithms/code/HEAD/tree/PhDThesis/Motrenko/doc/Motrenko2018Thesis.pdf?format=raw} } |
2018Aduenko A.A., Motrenko A.P., Strijov V.V. Object selection in credit scoring using covariance matrix of parameters estimations // Annals of Operations Research, 2018, 260(1-2) : 3-21. Article |
Abstract: We address the problem of outlier detection for more reliable credit scoring. Scoring models are used to estimate the probability of loan default based on the customer's application. To get an unbiased estimation of the model parameters one must select a set of informative objects (customers). We propose an object selection algorithm based on analysis of the covariance matrix for the estimated parameters of the model. To detect outliers we introduce a new quality function called specificity measure. For the common practical case of an ill-conditioned covariance matrix we suggest an empirical approximation of specificity. We illustrate the algorithm with eight benchmark datasets from the UCI machine learning repository and several artificial datasets. Computational experiments show statistical significance of the classification quality improvement for all considered datasets. The method is compared with four other widely used methods of outlier detection: deviance, Pearson and Bayesian residuals and gamma plots. The suggested method performs generally better for both clustered and non-clustered outliers. The method shows acceptable outlier discrimination for datasets that contain up to 30-40% of outliers. |
BibTeX: @article{Aduenko-Strijov2014ObjectSelection, author = {Aduenko, A. A. and Motrenko, A. P. and Strijov, V. V.}, title = {Object selection in credit scoring using covariance matrix of parameters estimations}, journal = {Annals of Operations Research}, year = {2018}, volume = {260(1-2)}, pages = {3-21}, url = {/papers/AduenkoObjectSelection_RV.pdf}, doi = {10.1007/s10479-017-2417-3} } |
Bakhteev O.Y., Strijov V.V. Deep learning model selection of suboptimal complexity // Automation and Remote Control, 2018, 79(8) : 1474-1488. Article |
Abstract: We consider the problem of model selection for deep learning models of suboptimal complexity. The complexity of a model is understood as the minimum description length of the combination of the sample and the classification or regression model. Suboptimal complexity is understood as an approximate estimate of the minimum description length, obtained with Bayesian inference and variational methods. We introduce probabilistic assumptions about the distribution of parameters. Based on Bayesian inference, we propose the likelihood function of the model. To obtain an estimate for the likelihood, we apply variational methods with gradient optimization algorithms. We perform a computational experiment on several samples. |
BibTeX: @article{Bakhteev2017Evidence, author = {Bakhteev, O. Y. and Strijov, V. V.}, title = {Deep learning model selection of suboptimal complexity}, journal = {Automation and Remote Control}, year = {2018}, volume = {79(8)}, pages = {1474-1488}, url = {https://link.springer.com/content/pdf/10.1134%2FS000511791808009X.pdf}, doi = {10.1134/S000511791808009X} } |
Goncharov A.V., Strijov V.V. Analysis of dissimilarity set between time series // Computational Mathematics and Modeling, 2018, 29(3) : 359-366. Article |
Abstract: This paper investigates the metric time series classification problem. Distance functions between time series are constructed using the dynamic time warping method. This method aligns two time series and builds a dissimilarity set. The vector-function of distance between the time series is a set of statistics that describes the distribution of the dissimilarity set. The object feature description in the classification problem is a set of selected statistics of the dissimilarity set. It is built between the object and all the reference objects. The additional information about the dissimilarity distribution improves the classification quality. We propose a classification method and demonstrate its results on the problem of classifying human physical activity time series from a mobile phone accelerometer. |
BibTeX: @article{Goncharov2017Analysis, author = {A. V. Goncharov and V. V. Strijov}, title = {Analysis of dissimilarity set between time series}, journal = {Computational Mathematics and Modeling}, year = {2018}, volume = {29(3)}, pages = {359-366}, url = {/papers/Goncharov2017Analysis.pdf}, doi = {10.1007/s10598-018-9415-4} } |
Isachenko R.V., Bochkarev V.V., Zharikov I.N., Strijov V.V. Feature Generation for Physical Activity Classification // Artificial Intelligence and Decision Making, 2018, 3 : 20-27. Article |
Abstract: The paper investigates the human physical activity classification problem. Time series from the accelerometer of a wearable device form the dataset. Due to the high dimension of the object description and limited computational resources, one has to state a feature generation problem. The authors propose to use parameters of the local approximation models as informative features. The experiment is conducted on two datasets for human activity recognition using an accelerometer: WISDM and USC-HAD. It compares several superpositions of various generation and classification models. |
BibTeX: @article{Isachenko2018Activity, author = {Isachenko, R. V. and Bochkarev, V. V. and Zharikov, I. N. and Strijov, V. V.}, title = {Feature Generation for Physical Activity Classification}, journal = {Artificial Intelligence and Decision Making}, year = {2018}, volume = {3}, pages = {20-27}, url = {/papers/Isachenko2018AccelerometerAIDM.pdf} } |
Isachenko R.V., Vladimirova M.R., Strijov V.V. Dimensionality reduction for time series decoding and forecasting problems // DEStech Transactions on Computer Science and Engineering, 2018, 27349 : 286-296. Article |
Abstract: The paper is devoted to the problem of decoding and forecasting multiscale time series. The goal is to recover the dependence between the input signal and the target response. The proposed method yields predicted values not only for the next time stamp but for the whole range of values within the forecast horizon. The prediction is a multidimensional target vector instead of a single time point. We consider the linear model of partial least squares (PLS). The method finds the matrix of a joint description for the design matrix and the outcome matrix. The obtained latent space of the joint descriptions is low-dimensional. This leads to a simple, stable predictive model. We conducted computational experiments on real data of energy consumption and electrocorticogram (ECoG) signals. The experiments show a significant reduction of the dimensionality of the original spaces, and the models achieve good prediction quality. |
BibTeX: @article{Isachenko2018PLS, author = {Isachenko, R. V. and Vladimirova, M. R. and Strijov, V. V.}, title = {Dimensionality reduction for time series decoding and forecasting problems}, journal = {DEStech Transactions on Computer Science and Engineering}, year = {2018}, volume = {27349}, pages = {286-296}, url = {/papers/IsachenkoVladimirova2018PLS.pdf}, doi = {10.12783/dtcse/optim2018/27940} } |
Isachenko R.V., Strijov V.V. Quadratic Programming Optimization with Feature Selection for Non-linear Models // Lobachevskii Journal of Mathematics, 2018, 39(9) : 1179-1187. Article |
Abstract: The Newton method is widely used to optimize model parameters. This method is a second-order optimization procedure that is unstable in real applications. In this paper we propose a procedure to make the optimization process robust. The idea is to select the set of model parameters that have to be optimized in the current step of the optimization procedure. We show that in the case of nonlinear regression and logistic regression models the parameter selection can be performed by the Quadratic Programming Feature Selection algorithm. It finds the set of independent parameters that are responsible for the residuals. We carried out an experiment to show how the proposed method works and to compare it with other methods. The paper proposes a robust second-order optimization algorithm. The algorithm is based on the iterative Newton method, which is an unstable procedure. The authors suggest selecting the set of active parameters in each optimization step. The algorithm updates only parameters from this active set. Quadratic programming feature selection is used to find the active set. It maximizes the relevance of model parameters to the residuals and minimizes the redundancy. Nonlinear regression and logistic regression models are investigated. The proposed algorithm achieves a lower error than the other methods. |
BibTeX: @article{Isachenko2018QPFSNonlin, author = {Isachenko, R. V. and Strijov, V. V.}, title = {Quadratic Programming Optimization with Feature Selection for Non-linear Models}, journal = {Lobachevskii Journal of Mathematics}, year = {2018}, volume = {39(9)}, pages = {1179-1187}, url = {https://rdcu.be/bfR32}, doi = {10.1134/S199508021809010X} } |
Motrenko A.P., Strijov V.V. Multi-way feature selection for ECoG-based brain-computer interface // Expert Systems with Applications, 2018, 114(30) : 402-413. Article |
Abstract: The paper addresses the problem of designing Brain-Computer Interfaces. We solve the problem of feature selection in regression models in application to ECoG-based motion decoding. The task is to predict hand trajectories from the voltage time series of cortical activity. The feature description of each point resides in the spatial-temporal-frequency domain and includes the voltage time series themselves and their spectral characteristics. Feature selection is crucial for an adequate solution of this regression problem, since electrocorticographic data is high-dimensional and the measurements are correlated in both the time and space domains. We propose a multi-way formulation of quadratic programming feature selection (QPFS), a recent approach to filtering-based feature selection proposed by Katrutsa and Strijov, "Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria". QPFS incorporates both estimates of similarity between features and their relevance to the regression problem, and provides an effective way to leverage them by solving a quadratic program. Our modification allows this approach to be applied to multi-way data. We show that this modification improves the prediction quality of the resulting models. |
BibTeX: @article{Motrenko2018ECoG, author = {Motrenko, A. P. and Strijov, V. V.}, title = {Multi-way feature selection for ECoG-based brain-computer interface}, journal = {Expert Systems with Applications}, year = {2018}, volume = {114(30)}, pages = {402-413}, url = {/papers/MotrenkoStrijov2017ECoG_HL_2.pdf}, doi = {10.1016/j.eswa.2018.06.054} } |
Uvarov N.D., Malkova A.S., Kuznetsov M.P., Rudakov K.V., Strijov V.V. Selection of superposition of models for railway freight forecasting // Moscow University Computational Mathematics and Cybernetics, 2018, 42(4) : 186-193. Article |
Abstract: Our aim is to construct an optimal superposition of models for the short-term railway traffic forecasting. The historical data constitutes daily railway traffic volume between pairs of stations for different cargo types. The given time series are highly volatile, noisy, and non-stationary. We propose a system that finds an optimal superposition of forecasting models with respect to historical data features. Among the candidate models the system considers: moving average model, exponential and kernel smoothing models, ARIMA model, Croston's method and LSTM neural networks. |
BibTeX: @article{Uvarov2018Superpositions, author = {N. D. Uvarov and A. S. Malkova and M. P. Kuznetsov and K. V. Rudakov and V. V. Strijov}, title = {Selection of superposition of models for railway freight forecasting}, journal = {Moscow University Computational Mathematics and Cybernetics}, year = {2018}, volume = {42}, number = {4}, pages = {186-193}, url = {/papers/Uvarov2018SuperpositionForecasting_eng.pdf}, doi = {10.3103/S027864191804009X} } |
Aduenko A.A., Vasileisky A.S., Karelov A.I., Reyer I.A., Rudakov K.V., Strijov V.V. Detection of persistent scatterer pairs on satellite radar images with use of surface relief data // Journal of Information Technologies and Computing Systems, 2018, 68(2) : 29-43. Article Rus |
Abstract: An effective control of geodynamic processes using multiple radar satellite survey and differential interferometric processing of received data requires the identification of terrain areas that preserve an acceptable level of coherence on radar images over a long period. Analysis of the phase component of the images for such areas, called persistent scatterers, makes it possible to estimate the values of small displacements of the observed surface with velocities less than several centimeters per year. In this paper, two radar differential interferometry methods based on the identification of persistent scatterers are considered: the standard method of persistent scatterers and the proposed modification of the method based on the use of persistent scatterer pairs. For both methods it is suggested not to perform a direct phase unwrapping, which is most difficult when most known methods are used. For the method of persistent scatterer pairs it is suggested to apply the quadratic penalty not for the phase unwrapping, but at the final processing stage to recover the absolute values of displacements and corrections of an a priori elevation model from the obtained relative values. The application of the algorithms considered is illustrated by the processing of an interferometric series of 35 radar images obtained by the COSMO-SkyMed system. |
BibTeX: @article{Aduedko2018PSP, author = {Aduenko, A. A. and Vasileisky, A. S. and Karelov, A. I. and Reyer, I. A. and Rudakov, K. V. and Strijov, V. V.}, title = {Detection of persistent scatterer pairs on satellite radar images with use of surface relief data}, journal = {Journal of Information Technologies and Computing Systems}, year = {2018}, volume = {68(2)}, pages = {29-43}, url = {/papers/Aduenko2017SAR.pdf}, doi = {10.14357/20718632180203} } |
Smerdov A.N., Bakhteev O.Y., Strijov V.V. Optimal recurrent neural network selection for paraphrase detection // Informatics and Applications, 2018, 12(4) : 63-69. Article Rus |
Abstract: The paper investigates the problem of optimal recurrent neural network selection. The lower bound of the model evidence is the selection criterion. The study focuses on the variational approach to approximating the posterior distribution of the model parameters. The normal distribution of parameters is approximated with various types of the covariance matrix. To boost the model evidence, the authors propose a method for removing parameters with the highest probability density at zero. As an illustrative example, the problem of multi-class classification on a sample of pairs of similar and dissimilar sentences from SemEval 2015 is considered. |
BibTeX: @article{Smerdov2017Paraphrase, author = {Smerdov, A. N. and Bakhteev, O. Y. and Strijov, V. V.}, title = {Optimal recurrent neural network selection for paraphrase detection}, journal = {Informatics and Applications}, year = {2018}, volume = {12}, number = {4}, pages = {63-69}, url = {/papers/SmerdovBakhteev2017Paraphrase.pdf}, doi = {10.14357/19922264180409} } |
Zamkovoy A.A., Kudiyarov S.P., Martyshkin R.V., Strijov V.V. Harmonization of historical data and expert models for forecasting demand for rail transportation // Vestnik Universiteta SUM, 2018, 4 : 51-60. Article Rus |
Abstract: The article addresses the problem of forecasting rail freight traffic volumes using retrospective data, analysis of the impact of external factors on the cargo base, and the distribution of goods shipments by transport mode. To improve forecast accuracy, the article proposes a model that integrates historical data on rail freight traffic volumes with expert assessments of external factors affecting the work of rail transport. The article describes the structure of the historical data, the time series of freight traffic volumes, and their relationship with expert models. |
BibTeX: @article{strijov2018RZD, author = {A. A. Zamkovoy and S. P. Kudiyarov and R. V. Martyshkin and V. V. Strijov}, title = {Harmonization of historical data and expert models for forecasting demand for rail transportation}, journal = {Vestnik Universiteta SUM}, year = {2018}, volume = {4}, pages = {51-60}, url = {https://vestnik.guu.ru/jour/article/view/996?locale=ru_RU}, doi = {10.26425/1816-4277-2018-4-51-60} } |
Usmanova K.R., Kudiyarov S.P., Martyshkin R.V., Zamkovoy A.A., Strijov V.V. Analysis of relationships between indicators in forecasting cargo transportation // Systems and Means of Informatics, 2018, 28(3) : 6-103. Article Rus |
Abstract: In this paper, we analyze the relationship and conformity between indicators in the control system, state monitoring and accounting of railway cargo transportation. Macroeconomic time series that contain control actions, system state and target criteria are considered. We suppose that control actions, state and goal-setting are statistically related. The Granger causality test is used to establish a relationship between time series. It is assumed that a pair of time series is related if using the history of one of the series improves the quality of the forecast of the other. The main goal of this analysis is to improve the quality of the cargo transportation forecast. The computational experiment is carried out on data about cargo transportation, control actions and set target criteria. |
BibTeX: @article{Usmanova2018TimeSeriesCorrelation, author = {K. R. Usmanova and S. P. Kudiyarov and R. V. Martyshkin and A. A. Zamkovoy and V. V. Strijov}, title = {Analysis of relationships between indicators in forecasting cargo transportation}, journal = {Systems and Means of Informatics}, year = {2018}, volume = {28}, number = {3}, pages = {6-103}, url = {/papers/Usmanova2018TimeSeriesCorrelation.pdf}, doi = {10.14357/08696527180307} } |
2017Cinar Y.G., Mirisaee H., Goswami P., Gaussier E., Ait-Bachir A., Strijov V.V. Time series forecasting using RNNs: an extended attention mechanism to model periods and handle missing values // ICONIP 2017, 2017. Article |
Abstract: In this paper, we study the use of recurrent neural networks (RNNs) for modeling and forecasting time series. We first illustrate the fact that standard sequence-to-sequence RNNs neither capture periods in time series well nor handle missing values well, even though many real-life time series are periodic and contain missing values. We then propose an extended attention mechanism that can be deployed on top of any RNN and that is designed to capture periods and make the RNN more robust to missing values. We show the effectiveness of this novel model through extensive experiments with multiple univariate and multivariate datasets. |
BibTeX: @article{Cinar2017TimeSeries, author = {Yagmur G. Cinar and Hamid Mirisaee and Parantapa Goswami and Eric Gaussier and Ali Ait-Bachir and Vadim V. Strijov}, title = {Time series forecasting using RNNs: an extended attention mechanism to model periods and handle missing values}, journal = {ICONIP 2017}, year = {2017}, url = {https://arxiv.org/pdf/1703.10089.pdf} } |
Katrutsa A.M., Strijov V.V. Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria // Expert Systems with Applications, 2017, 76 : 1-11. Article |
Abstract: This paper provides a new approach to feature selection based on the concept of feature filters, so that feature selection is independent of the prediction model. Data fitting is stated as a single-objective optimization problem, where the objective function indicates the error of approximating the target vector as some function of given features. Linear dependence between features induces the multicollinearity problem and leads to instability of the model and redundancy of the feature set. This paper introduces a feature selection method based on quadratic programming. This approach takes into account the mutual dependence of the features and the target vector, and selects features according to relevance and similarity measures defined according to the specific problem. The main idea is to minimize mutual dependence and maximize approximation quality by varying a binary vector that indicates the presence of features. The selected model is less redundant and more stable. To evaluate the quality of the proposed feature selection method and compare it with others, we use several criteria to measure instability and redundancy. In our experiments, we compare the proposed approach with several other feature selection methods, and show that the quadratic programming approach gives superior results according to the criteria considered for the test and real data sets. |
BibTeX: @article{Katrutsa2016QPFeatureSelection, author = {Katrutsa, A. M. and Strijov, V. V.}, title = {Comprehensive study of feature selection methods to solve multicollinearity problem according to evaluation criteria}, journal = {Expert Systems with Applications}, year = {2017}, volume = {76}, pages = {1-11}, url = {/papers/Katrutsa2016QPFeatureSelection.pdf}, doi = {10.1016/j.eswa.2017.01.048} } |
Kulunchakov A.S., Strijov V.V. Generation of simple structured Information Retrieval functions by genetic algorithm without stagnation // Expert Systems with Applications, 2017, 85 : 221-230. Article |
Abstract: This paper investigates an approach to constructing new ranking models for Information Retrieval. The IR ranking model depends on the document description. It includes the term frequency and document frequency. The model ranks documents upon a user request. The quality of the model is defined by the difference between the documents that experts assess as relevant to the request and the ranked ones. To boost the model quality, a modified genetic algorithm was developed. It generates models as superpositions of primitive functions and selects the best according to the quality criterion. The main contribution of the research is a new technique to avoid stagnation and to control the structural complexity of the consecutively generated models. To solve the problems of stagnation and complexity, a new criterion of model selection was introduced. It uses a structural metric and penalty functions, which are defined in the space of generated superpositions. To show that the newly discovered models outperform the other state-of-the-art IR scoring models, the authors perform a computational experiment on TREC datasets. It shows that the resulting algorithm is significantly faster than the exhaustive one. It constructs better ranking models according to the MAP criterion. The obtained models are much simpler than the models constructed with alternative approaches. The proposed technique is significant for developing information retrieval systems based on expert assessments of the query-document relevance. |
BibTeX: @article{Kulunchakov2016IRfunc, author = {Kulunchakov, A. S. and Strijov, V. V.}, title = {Generation of simple structured Information Retrieval functions by genetic algorithm without stagnation}, journal = {Expert Systems with Applications}, year = {2017}, volume = {85}, pages = {221-230}, url = {/papers/Kulunchakov2014RankingBySimpleFun.pdf}, doi = {10.1016/j.eswa.2017.05.019} } |
Rudakov K.V., Kuznetsov M.P., Motrenko A.P., Stenina M.M., Kashirin D.O., Strijov V.V. Selecting an optimal model for forecasting the volumes of railway goods transportation // Automation and Remote Control, 2017, 78(1) : 75-87. Article |
Abstract: Consideration was given to selection of an optimal model of short-term forecasting of the volumes of railway transport from the historical and exogenous time series. The historical data carry information about the transportation volumes of various goods between pairs of stations. It was assumed that the result of selecting an optimal model depends on the level of aggregation in the types of goods, departure and destination points, and time. Considered were the models of vector autoregression, integrated model of the autoregressive moving average, and a nonparametric model of histogram forecasting. Criteria for comparison of the forecasts on the basis of distances between the errors of model forecasts were proposed. They are used to analyze the models with the aim of determining the admissible requests for forecast, the actual forecast depth included. |
BibTeX: @article{Rudakov2015RZD, author = {Rudakov, K. V. and Kuznetsov, M. P. and Motrenko, A. P. and Stenina, M. M. and Kashirin, D. O. and Strijov, V. V.}, title = {Selecting an optimal model for forecasting the volumes of railway goods transportation}, journal = {Automation and Remote Control}, year = {2017}, volume = {78(1)}, pages = {75-87}, url = {/papers/Rudakov2015RZD.pdf}, doi = {10.1134/S0005117917010064} } |
Bochkarev I.L., Sofronov I.L., Strijov V.V. Generation of expertly-interpreted models for prediction of core permeability // Systems and Means of Informatics, 2017, 27(3) : 74-87. Article Rus |
Abstract: This article is devoted to prediction of core permeability. Permeability is one of the main properties for estimation of filtration of gas and liquid in core. To build a permeability model, porosity, density, depth of measurement, and other core physical properties are used. An algorithm for choosing the optimal prediction model is proposed. The model of superpositions of expertly-defined functions is suggested. The proposed method is a superposition of previously obtained optimal expertly-defined functions and a two-layer neural network. The experiment on core analysis, aero- and hydrodynamics datasets was conducted. During the experiment, the optimal expertly-interpreted models for all datasets were derived. The suggested approach is compared to other methods for choosing models, such as Lasso regression, support vector regression (SVR), gradient boosting, and neural network. The error and optimal parameters estimation was conducted using cross-validation. The experiment showed that the proposed approach is competitive with other state-of-the-art methods. Moreover, the number of neurons is significantly reduced with the use of superpositions of expertly-defined functions. |
BibTeX: @article{Bochkarev2017PermeabilityEstimation, author = {Bochkarev, I. L. and Sofronov, I. L. and Strijov, V. V.}, title = {Generation of expertly-interpreted models for prediction of core permeability}, journal = {Systems and Means of Informatics}, year = {2017}, volume = {27(3)}, pages = {74-87}, url = {/papers/Bochkarev2017PermeabilityEstimation.pdf}, doi = {http://www.ipiran.ru/journal_system/article/08696527170307.html} } |
Molybog I.O., Motrenko A.P., Strijov V.V. Improving classification quality for intrinsic plagiarism problem // Informatics and Applications, 2017, 11(3) : 59-71. Article Rus |
Abstract: The paper addresses the classification problem in multidimensional spaces. The authors propose a supervised modification of the t-distributed Stochastic Neighbor Embedding algorithm. Additional features of the proposed modification are that, unlike the original algorithm, it does not require retraining if new data is added to the training set and can be easily parallelized. The novel method was applied to detect intrinsic plagiarism in a collection of documents. The authors also test the performance of their algorithm using synthetic data and show that the quality of classification is higher with the algorithm than without it or with other algorithms for dimension reduction. |
BibTeX: @article{MolybogMotrenko2017DimRed, author = {Molybog, I. O. and Motrenko, A. P. and Strijov, V. V.}, title = {Improving classification quality for intrinsic plagiarism problem}, journal = {Informatics and Applications}, year = {2017}, volume = {11(3)}, pages = {59-71}, url = {/papers/MolybogMotrenko2017DimRed.pdf}, doi = {http://www.ipiran.ru/journal/issues/article/19922264170307.html} } |
Aduenko A.A. Model selection in classification problems (PhD thesis supervised by V.V. Strijov). Moscow Institute of Physics and Technology, 2017. PhdThesis Rus |
Abstract: The problem of constructing multimodels for classification is investigated. Classification is a basic task in machine learning, and multiclass classification problems can be effectively reduced to one or more two-class classification problems. Examples of two-class classification are determining the presence of a disease in a patient from a set of test results, analyzing texts to determine the sentiment of messages, and credit scoring. These tasks are relevant in connection with the spread of remote diagnostics and automatic decision-making systems. Logistic regression, which is the standard in credit scoring, and other generalized linear models cannot take into account heterogeneity in the data, in particular the dependence of feature importance on the object, and are therefore not optimal in its presence. To take heterogeneity in the data into account, classifier compositions are used. Methods for constructing a model composition account for heterogeneity by building a multimodel that contains several single models. Models in the multimodel can be close or coincident, which leads to uninterpretability and a decrease in forecast quality. Existing works offer heuristics for thinning the ensemble of models in bagging. Genetic algorithms are used to select a subset of models in keying. Other works cluster the models and choose a single representative for each cluster. Some papers offer a greedy strategy of gradually increasing the number of classifiers in bagging. To control the number of models, a sparse prior distribution of the model weights in the mixture is used. The structure of the mixture is sought by maximizing the evidence. 
However, these methods of thinning mixtures do not take into account the proximity between models, and therefore the multimodel can still contain close models. To obtain statistically distinguishable models in multimodels, an external thinning procedure is used, based on a statistical comparison of models by calculating distances between posterior parameter distributions for different models, for example, using Bregman divergences or f-divergences. In this work it is shown that the existing similarity measures distinguish the noninformative model and the coincident informative one, and therefore do not allow an adequate multimodel to be built. To solve this problem, a similarity function is proposed that allows statistical differentiation of models. The proposed approach takes into account heterogeneities in the data and yields an adequate multimodel that contains fewer models and has better classification quality. The presence of redundant or multicorrelated features affects not only the classification quality of the constructed model, but also its stability. To solve the feature selection problem, this work uses the Bayesian approach, namely the maximum evidence principle for determining the structure of models. To solve the problem of multicollinearity of features, a set of non-multicollinear features is constructed by optimizing the quality criterion. In this work it is shown that the approach associated with feature selection is not optimal. It is proved that the maximum evidence method cannot take into account the dependencies between features, since the maximum evidence estimate of the covariance matrix of feature weights is asymptotically degenerate. For optimal use of information from multicollinear features, it is suggested that they be combined. |
BibTeX: @phdthesis{Aduenko2017ModelSelection, author = {Aduenko, A. A.}, title = {Model selection in classification problems (PhD thesis supervised by V.V. Strijov)}, school = {Moscow Institute of Physics and Technology}, year = {2017}, url = {http://www.frccsc.ru/sites/default/files/docs/ds/002-073-05/diss/11-aduenko/11-Aduenko_main.pdf?626} } |
Kuzmin A.A. Hierarchical classification of document collection (PhD thesis supervised by V.V. Strijov). Moscow Institute of Physics and Technology, 2017. PhdThesis Rus |
Abstract: This work investigates methods of text documents categorization and classification. These methods automatically structure documents as hierarchical themes. Also, they optimize existing themes and reveal thematic inconsistencies. |
BibTeX: @phdthesis{Kuzmin2017HierarchicalClustering, author = {Kuzmin, A. A.}, title = {Hierarchical classification of document collection (PhD thesis supervised by V.V. Strijov)}, school = {Moscow Institute of Physics and Technology}, year = {2017}, url = {http://www.frccsc.ru/sites/default/files/docs/ds/002-073-05/diss/08-kuzmin/008-kuzmin_main-txt.pdf?809} } |
2016Kuznetsov M.P., Tokmakova A.A., Strijov V.V. Analytic and stochastic methods of structure parameter estimation // Informatica, 2016, 27(3) : 607-624. Article |
Abstract: The paper presents analytic and stochastic methods of structure parameters estimation for model selection. Structure parameters are covariance matrices of parameters of linear and non-linear regression models. To optimize the model parameters and the structure parameters we maximize the model evidence including the data likelihood and the prior parameter distribution. The analytic methods are based on the approximated model evidence derivatives computation. The stochastic methods are based on the model parameters sampling and data cross-validation. The proposed methods are tested and compared on synthetic and real data. |
BibTeX: @article{Kuznetsov2013Structure, author = {Kuznetsov, M. P. and Tokmakova, A. A. and Strijov, V. V.}, title = {Analytic and stochastic methods of structure parameter estimation}, journal = {Informatica}, year = {2016}, volume = {27(3)}, pages = {607-624}, url = {/papers/HyperOptimizationEng.pdf}, doi = {10.15388/Informatica.2016.102} } |
Kuznetsov M.P., Motrenko A.P., Kuznetsova M.V., Strijov V.V. Methods for intrinsic plagiarism detection and author diarization // Working Notes of CLEF, 2016, 1609 : 912-919. Article |
Abstract: The paper investigates methods for intrinsic plagiarism detection and author diarization. We developed a plagiarism detection method based on constructing an author style function from features of text sentences and detecting outliers. We adapted the method for the diarization problem by segmenting author style statistics on text parts, which correspond to different authors. Both methods were tested on the PAN-2011 collection for the intrinsic plagiarism detection and implemented for the PAN-2016 competition on author diarization. |
BibTeX: @article{Kuznetsov2016CLEF, author = {Kuznetsov, M. P. and Motrenko, A. P. and Kuznetsova, M. V. and Strijov, V. V.}, title = {Methods for intrinsic plagiarism detection and author diarization}, journal = {Working Notes of CLEF}, year = {2016}, volume = {1609}, pages = {912-919}, url = {http://ceur-ws.org/Vol-1609/16090912.pdf}, doi = {http://ceur-ws.org/Vol-1609/} } |
Motrenko A.P., Rudakov K.V., Strijov V.V. Combining endogenous and exogenous variables in a special case of non-parametric time series forecasting model // Moscow University Computational Mathematics and Cybernetics, 2016, 40(2) : 71-78. Article |
Abstract: We address the problem of improving the quality of time series forecasting by taking into account information about exogenous factors. Our aim is to improve a special case of non-parametric forecasting algorithm, namely the hist algorithm, derived from quantile regression. The hist algorithm minimizes the convolution of a histogram of time series with the loss function. To include exogenous factors into this model we suggest correcting the histogram of endogenous time series using exogenous time series. We propose to adjust the histogram using mixtures of conditional histograms as a less sparse alternative to a multidimensional histogram, and in some cases demonstrate a decrease of loss compared to the basic forecasting algorithm. To the best of our knowledge, such an approach to combining endogenous and exogenous time series is original and has not been proposed yet. The suggested method is illustrated with the data from the Russian Railways. |
BibTeX: @article{Motrenko2015ExogenousFactors, author = {Motrenko, A. P. and Rudakov, K. V. and Strijov, V. V.}, title = {Combining endogenous and exogenous variables in a special case of non-parametric time series forecasting model}, journal = {Moscow University Computational Mathematics and Cybernetics}, year = {2016}, volume = {40(2)}, pages = {71-78}, url = {/papers/Motrenko2015ExogenousFactors.pdf}, doi = {10.3103/S0278641916020072} } |
Motrenko A.P., Strijov V.V. Extracting fundamental periods to segment human motion time series // IEEE Journal of Biomedical and Health Informatics, 2016, 20(6) : 1466-1476. Article |
Abstract: The paper addresses a problem of sensor-based time series segmentation as a part of human activity recognition problem. We assume that each studied time series contains a fundamental periodic which can be seen as an ultimate entity (cycle) of motion. Due to the nature of the data and the urge to obtain interpretable results of segmentation, we define the segmentation as a partition of the time series into the periods of this fundamental periodic. To split the time series into periods we select a pair of principal components of the Hankel matrix. We then cut the trajectory of the selected principal components by its symmetry axis, thus obtaining half-periods that are merged into segments. A method of selecting a pair of components, corresponding to the fundamental periodic is proposed. |
BibTeX: @article{Motrenko2015Fundamental, author = {Motrenko, A. P. and Strijov, V. V.}, title = {Extracting fundamental periods to segment human motion time series}, journal = {IEEE Journal of Biomedical and Health Informatics}, year = {2016}, volume = {20(6)}, pages = {1466-1476}, url = {/papers/MotrenkoStrijov2014RV2.pdf}, doi = {10.1109/JBHI.2015.2466440} } |
Bakhteev O.Y., Popova M.S., Strijov V.V. Systems and means of deep learning for classification problems // Systems and Means of Informatics, 2016, 26(2) : 4-22. Article Rus |
Abstract: The paper provides guidance on deep learning net construction and optimization using GPU. The paper proposes to use GPU instances on the cloud platform Amazon Web Services. The problem of time series classification is considered. The paper proposes to use a deep learning net, i.e. a multilevel superposition of models, belonging to the following classes: Restricted Boltzmann Machines, autoencoders and neural nets with a softmax function in the output. The proposed method was tested on a dataset containing time segments from a mobile phone accelerometer. The relation between classification error, dataset size and the number of superposition parameters is analyzed. |
BibTeX: @article{Bakhteev2016AWS, author = {Bakhteev, O. Yu. and Popova, M. S. and Strijov, V. V.}, title = {Systems and means of deep learning for classification problems}, journal = {Systems and Means of Informatics}, year = {2016}, volume = {26(2)}, pages = {4-22}, url = {/papers/Bakhteev2016AWS.pdf}, doi = {10.14357/08696527160201} } |
Goncharov A.V., Strijov V.V. Metric time series classification using weighted dynamic warping relative to centroids of classes // Informatics and Applications, 2016, 10(2) : 36-47. Article Rus |
Abstract: This paper discusses a problem of metric time series analysis and classification. The proposed classification model uses the matrix of distances between time series, which is built with a fixed distance function. The dimension of this distance matrix is very high and all related calculations are time-consuming. The problem of reducing the computational complexity is solved by selecting reference objects and using them to describe classes. A model that uses dynamic time warping to build reference objects, or centroids, is chosen as the basic model. This paper introduces a weight function for each centroid that influences the calculation of the distance measure. Time series of different analytic functions and time series of human activity from a mobile phone accelerometer are used as the objects for classification. Properties and classification results of this model are investigated and compared with the properties of the basic model. |
BibTeX: @article{Goncharov2015autumn, author = {Goncharov, A. V. and Strijov, V. V.}, title = {Metric time series classification using weighted dynamic warping relative to centroids of classes}, journal = {Informatics and Applications}, year = {2016}, volume = {10(2)}, pages = {36-47}, url = {/papers/Goncharov2015authumn.pdf}, doi = {10.14357/19922264160204} } |
Isachenko R.V., Strijov V.V. Metric learning in multiclass time series classification problem // Informatics and Applications, 2016, 10(2) : 48-57. Article Rus |
Abstract: This paper is devoted to the problem of multiclass time series classification. It is proposed to align time series in relation to class centroids. Building of centroids and alignment of time series is carried out by the dynamic time warping algorithm. The accuracy of classification depends significantly on the metric used to compute distances between time series. The distance metric learning approach is used to improve classification accuracy. The metric learning procedure modifies distances between objects to make objects from the same cluster closer and from the different clusters more distant. The distance between time series is measured by the Mahalanobis metric. The distance metric learning procedure finds the optimal transformation matrix for the Mahalanobis metric. To calculate quality of classification, a computational experiment on synthetic data and real data of human activity recognition was carried out. |
BibTeX: @article{Isachenko2016MetricsLearning, author = {Isachenko, R. V. and Strijov, V. V.}, title = {Metric learning in multiclass time series classification problem}, journal = {Informatics and Applications}, year = {2016}, volume = {10(2)}, pages = {48-57}, url = {/papers/Isachenko2016MetricsLearning.pdf}, doi = {10.14357/19922264160205} } |
Karasikov M.E., Strijov V.V. Feature-based time-series classification // Informatics and Applications, 2016, 10(4) : 121-131. Article Rus |
Abstract: The paper is devoted to the multi-class time-series classification problem. A feature-based approach that uses meaningful and concise representations for feature space construction is applied. A time series is considered as a sequence of segments approximated by parametric models, and their parameters are used as time-series features. This feature construction method inherits from the approximation model such unique properties as shift invariance. We propose an approach to solve the time-series classification problem using distributions of parameters of the approximation model. The proposed approach is applied to the human activity classification problem. The computational experiments on real data demonstrate superiority of the proposed algorithm over baseline solutions. |
BibTeX: @article{Karasikov2016TSC, author = {Karasikov, M. E. and Strijov, V. V.}, title = {Feature-based time-series classification}, journal = {Informatics and Applications}, year = {2016}, volume = {10(4)}, pages = {121-131}, url = {/papers/Karasikov2016TSC.pdf}, doi = {10.14357/19922264160413} } |
Kuznetsova M.V., Strijov V.V. Local forecasting of time series with invariant transformations // Information Technologies, 2016, 22(6) : 457-462. Article Rus |
Abstract: The paper describes a univariate time series forecasting model. It proposes to find segments of local history, which are similar to the forecasted segment. A distance function is used to cluster segments. The forecast is the average of the values of time series from this cluster. To improve the quality of forecast the paper proposes an invariant transformation of segments. This transformation preserves the equivalence of time series with respect to clusters. The transformation is a function, constructed by the dynamic time warping procedure. The retrospective forecasting procedure calculates the accuracy of the forecasting model. Accelerometer time series of a person's motion are used in the computational experiment. It compares two constructed forecasting models. The first one clusters segments, the second one uses the k-nearest neighbor algorithm to select similar segments. |
BibTeX: @article{Kuznetsova2015TimeSeries, author = {Kuznetsova, M. V. and Strijov, V. V.}, title = {Local forecasting of time series with invariant transformations}, journal = {Information Technologies}, year = {2016}, volume = {22(6)}, pages = {457-462}, url = {/papers/Kuznetsova2015TimeSeries.pdf} } |
Neychev R.G., Katrutsa A.M., Strijov V.V. Robust selection of multicollinear features in forecasting // Factory Laboratory, 2016, 82(3) : 68-74. Article Rus |
Abstract: This paper considers a problem of constructing a stable forecasting model using feature selection methods. It proposes a multicollinearity detection criterion, which is necessary in the case of excessive number of features. To investigate properties of this criterion, a theorem is stated. It develops the Belsley method. The proposed criterion drives an algorithm that excludes correlated features, reduces dimensionality of the feature space and obtains robust estimates of the model parameters. The algorithm adds and removes features sequentially according to this criterion. The LAD-Lasso algorithm was chosen as the baseline for comparison. The computational experiment investigates a problem of forecasting the hourly price curve with the proposed and the baseline algorithms. The experiment was carried out using time series of the German electricity prices. |
BibTeX: @article{Neychev2015FeatureSelection, author = {Neychev, R. G. and Katrutsa, A. M. and Strijov, V. V.}, title = {Robust selection of multicollinear features in forecasting}, journal = {Factory Laboratory}, year = {2016}, volume = {82(3)}, pages = {68-74}, url = {/papers/Neychev2015FeatureSelection.pdf} } |
Zadayanchuk A.I., Popova M.C., Strijov V.V. Selection of optimal physical activity classification model using measurements of accelerometer // Information Technologies, 2016, 22(4) : 313-318. Article Rus |
Abstract: This paper solves the problem of selecting optimal stable models for classification of physical activity. We select optimal models from the class of two-layer artificial neural networks. There are three different ways to change the structure of a network: network pruning, network growing, and their combination. We construct models by removing their neurons. Neural networks with an insufficient or excessive number of neurons have insufficient generalization ability and can make unstable predictions. The proposed genetic algorithm optimizes the neural network structure. The novelty of the work lies in the fact that the probability of removing neurons is determined by the variance of parameters. In the computational experiment, models are generated by optimizing two quality criteria: accuracy and stability. |
BibTeX: @article{Zadayanchuk2015OptimalNN4, author = {Zadayanchuk, A. I. and Popova, M. C. and Strijov, V. V.}, title = {Selection of optimal physical activity classification model using measurements of accelerometer}, journal = {Information Technologies}, year = {2016}, volume = {22(4)}, pages = {313-318}, url = {/papers/Zadayanchuk2015OptimalNN4.pdf} } |
Zhuravlev Y.I., Rudakov K.V., Korchagin A.D., Kuznetsov M.P., Motrenko A.P., Stenina M.M., Strijov V.V. Methods for hierarchical time series forecasting // Notices of the Russian Academy of Sciences, 2016, 86(2) : 138. Article Rus |
Abstract: The paper investigates problems of planning in railway freight transportation under conditions of non-stationary, non-uniform and noisy data. To boost the quality of planning it proposes to create an intelligent system, which is based on mathematical models, historical data and expert estimations. The paper describes a project on a forecasting system to plan railway freight transportation, following an analysis of the dependence of freight transportation demand on exogenous factors. |
BibTeX: @article{Zhur2016TimeSeries, author = {Zhuravlev, Yu. I. and Rudakov, K. V. and Korchagin, A. D. and Kuznetsov, M. P. and Motrenko, A. P. and Stenina, M. M. and Strijov, V. V.}, title = {Methods for hierarchical time series forecasting}, journal = {Notices of the Russian Academy of Sciences}, year = {2016}, volume = {86(2)}, pages = {138}, url = {/papers/Zhuravlev2015RZD.pdf}, doi = {10.7868/S0869587316020213} } |
Goncharov A.V., Strijov V.V. Continuous time series alignment in human actions recognition // Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference proceedings // AINL FRUCT: Artificial Intelligence and Natural Language Conference, 2016 : 83-86. InProceedings |
Abstract: Human physical activity monitoring with wearable devices imposes significant restrictions on the processing power and the amount of memory available to the algorithm. To satisfy these constraints, it is proposed to move from a discrete time series representation to its analytical description and to analyze time series using mathematical models. The work deals with physical activity classification. It uses a metric classification algorithm, where the object's class is determined by the distance from this object to the nearest centroid. The paper proposes to approximate all time series with splines and find the distance to the nearest centroid using a continuous alignment path. The calculation of distance is performed using analytical transformations. |
BibTeX: @inproceedings{Gonchariv2016Fruct, author = {Goncharov, A. V. and Strijov, V. V.}, title = {Continuous time series alignment in human actions recognition}, booktitle = {AINL FRUCT: Artificial Intelligence and Natural Language Conference}, journal = {Artificial Intelligence and Natural Language and Information Extraction, Social Media and Web Search FRUCT Conference proceedings}, year = {2016}, pages = {83-86}, url = {/papers/Goncharov_Fruct_2016.pdf} } |
Kuzmin A.A., Aduenko A.A., Strijov V.V. Hierarchical thematic modeling of short text collection // Intelligent Data Processing, Conference Proceedings, 2016 : 174-175. InProceedings |
Abstract: The aim of this study is to construct and verify a hierarchical thematic model of a short text collection. The authors consider approaches to metric learning and feature selection. Agglomerative and divisive methods to construct a hierarchical model are compared. A hierarchical weighted similarity function is suggested for unlabeled data classification. Weights in this function are the importance values of the terms from the collection dictionary. An entropy-based approach is used to estimate these weights according to the expert model. The proposed similarity function is represented as a four-level neural network to consider vector representation of the words given by a trained language model. The proposed methods are used to construct an expert system that helps experts to classify unlabeled abstracts of the major conference EURO. The parameters of this model are estimated using expert models of EURO conference from 2006 till 2016. The results are compared with hierarchical multiclass SVM, probabilistic thematic model SuhiPLSA, and hierarchical naive Bayes approach. |
BibTeX: @inproceedings{Kuzmin2016IDP, author = {Kuzmin, A. A. and Aduenko, A. A. and Strijov, V. V.}, title = {Hierarchical thematic modeling of short text collection}, booktitle = {Intelligent Data Processing, Conference Proceedings}, year = {2016}, pages = {174-175}, url = {/papers/Kuzmin_Modeling_ShortText2016.pdf} } |
Kuzmin A.A., Aduenko A.A., Strijov V.V. Thematic Classification for EURO/IFORS Conference Using Expert Model // 28th European Conference on Operational Research, 2016. InProceedings |
Abstract: Every year the program committee of a major conference constructs its scientific program. Some participants take part in invited sessions, but for the majority of participants the PC along with experts have to choose sessions according to their contributed abstracts. To fit an abstract into the current conference programme one has to construct an expert system. It should respect the structure of previous conferences and use thematic modeling techniques. The conference structure is represented as a tree. It has abstracts as leaves and areas, streams, sessions as nodes. Abstracts from the previous conferences already have their positions in this structure. To classify a new abstract one can use divisive hierarchical classification methods, based on SVM, NB or kNN. However, these methods are greedy. An insufficient number of abstracts in each lowest-level cluster makes classification unstable. In addition, expert and algorithmic classifications differ. So a group of the most relevant clusters is preferable to the single best one to meet expert needs. We propose a relevance operator that returns all clusters sorted by their relevance. We consider three ways of constructing such an operator using hierarchical multiclass SVM, PLSA with Adaptive Regularization, and the proposed weighted hierarchical similarity function. We construct a model of EURO 2010 using expert models of EURO 2012 and 2013 to demonstrate performance of the proposed methods. |
BibTeX: @inproceedings{KuzminEURO2016, author = {Kuzmin, A. A. and Aduenko, A. A. and Strijov, V. V.}, title = {Thematic Classification for EURO/IFORS Conference Using Expert Model}, booktitle = {28th European Conference on Operational Research}, year = {2016}, url = {/papers/KuzminEURO2016.pdf} } |
Motrenko A.P., Neychev R.G., Isachenko R.V., Popova M.S., Gromov A.N., Strijov V.V. Feature generation for multiscale time series forecasting // Intelligent Data Processing, Conference Proceedings, 2016 : 129-130. InProceedings |
Abstract: The paper presents a framework for the massive multiscale time series forecast. The focus is on the problem of forecasting behavior of a device within the concept of Internet of things. The device is monitored by a set of sensors, which produces large amount of multiscale time series during its lifespan. These time series have various time scales since distinct sensors produce observations with various frequencies from milliseconds to weeks. The main goal is to predict the observations of a device in a given time range. The authors propose a method of constructing efficient feature description for the corresponding regression problem. The method involves feature generation and dimensionality reduction procedures. Generated features include historical information about the target time series as well as other available time series, local transformations, and multiscale features. Several forecasting algorithms have been applied to the resulting regression problem and the quality of the forecasts has been investigated for various horizon values. |
BibTeX: @inproceedings{MotrenkoMiltiscale2016IDP, author = {Motrenko, A. P. and Neychev, R. G. and Isachenko, R. V. and Popova, M. S. and Gromov, A. N. and Strijov, V. V.}, title = {Feature generation for multiscale time series forecasting}, booktitle = {Intelligent Data Processing, Conference Proceedings}, year = {2016}, pages = {129-130}, url = {https://sourceforge.net/p/mvr/code/HEAD/tree/lectures/DataFest/Strijov2016FeatureGeneration.pdf?format=raw} } |
Neychev R.G., Motrenko A.P., Isachenko R.V., Inyakin A.S., Strijov V.V. Multimodel forecasting multiscale time series in Internet of things // Intelligent Data Processing, 2016 : 130-131. InProceedings |
Abstract: The paper presents an approach to forecasting multiple intercorrelated time series that can be generated by different sensors of devices within the concept of the Internet of things. In this case, the generated data are not independent and identically distributed, and their feature space has a complex structure. The forecast construction is considered as a regression problem. To solve it, the authors propose a mixture-of-experts approach in which several forecasting models are used. Neural networks are chosen as the forecasting models. The optimal structure of the neural networks, their parameters, and the number of experts are analyzed. The proposed method has been tested in a computational experiment where it was compared to gradient boosting and decision tree methods. The experiment was conducted on real data containing information about electricity consumption and weather conditions in Poland. |
BibTeX: @inproceedings{Neychev2016IDP, author = {Neychev, R. G. and Motrenko, A. P. and Isachenko, R. V. and Inyakin, A. S. and Strijov, V. V.}, title = {Multimodel forecasting multiscale time series in Internet of things}, booktitle = {Intelligent Data Processing}, year = {2016}, pages = {130-131}, url = {http://www.machinelearning.ru/wiki/images/9/94/NeychevIDP11.pdf} } |
Strijov V.V., Motrenko A.P. Large-scale time series forecasting // 28th European Conference on Operational Research, 2016. InProceedings |
Abstract: The talk is devoted to investigating the behavior of a device that is a member of the internet of things. A device is monitored by a set of sensors, which produce a large amount of multiscale time series during its lifespan. These time series have various time scales, since measurements can be taken every millisecond, day, week, etc. The main goal is to forecast the next state of a device. The investigation assumes the following conditions for the time series of a single device unit: there is a large set of multiscale time series; the sampling rate of each time series is fixed; each time series has its own forecast horizon. To make an adequate forecasting model, the following hypotheses must hold: the time history is sufficiently long; the time series have auto- and cross-correlation dependencies. The model is static, so there exists a history of optimal size. Each time series can be interpolated by some local model, that is, there exists a local approximation model that can be applied when data are locally missing. The problem is stated within the vector autoregression approach. To find a model of optimal complexity, a sequential model generation and selection procedure was constructed. The test bench compares random forest, boosting, and mixture of experts. |
BibTeX: @inproceedings{Strijov2016MultiscaleForecasting, author = {Strijov, V. V. and Motrenko, A. P.}, title = {Large-scale time series forecasting}, booktitle = {28th European Conference on Operational Research}, year = {2016}, url = {/papers/Strijov2016MultiscaleForecasting.pdf} } |
Vladimirova M.R., Strijov V.V. Bagging of neural networks in multitask classification of biological activity for nuclear receptors // Intelligent Data Processing, Conference Proceedings, 2016 : 18-19. InProceedings |
Abstract: The paper is devoted to the multitask classification problem. The main purpose is to build an adequate model to predict whether an object belongs to a particular class, namely, whether a ligand binds to a specific nuclear receptor. Nuclear receptors are a class of proteins found within cells. These receptors work with other proteins to regulate the expression of specific genes, thereby controlling the development, homeostasis, and metabolism of the organism. The regulation of gene expression generally only happens when a ligand, a molecule that affects the receptor's behavior, binds to a nuclear receptor. A two-layer neural network is used as the classification model. The paper considers the problems of linear and logistic regression with squared and cross-entropy loss functions. To analyze the classification result, the authors propose to decompose the error into bias and variance terms. To improve the quality of classification by reducing the error variance, the authors suggest bagging of neural networks. Bagging generates a set of subsamples from the training sample using the bootstrap procedure. All subsamples have the same size as the initial sample. Classifiers are trained on each subsample separately. Then their individual predictions are aggregated by voting. The proposed method improves the classification quality on the investigated sample. |
BibTeX: @inproceedings{Vladimorove2016IDP, author = {Vladimirova, M. R. and Strijov, V. V.}, title = {Bagging of neural networks in multitask classification of biological activity for nuclear receptors}, booktitle = {Intelligent Data Processing, Conference Proceedings}, year = {2016}, pages = {18-19}, url = {http://www.machinelearning.ru/wiki/images/5/5f/VladimirovaIOI2016_eng.pdf} } |
Kuznetsov M.P. Construction preference learning models using ordinal-scaled expert estimations (PhD thesis supervised by V.V. Strijov). Moscow Institute of Physics and Technology, 2016. PhdThesis Rus |
Abstract: The thesis is devoted to preference learning models. The proposed methods involve rank-scaled expert estimations as object features. |
BibTeX: @phdthesis{Kuznetsov2016PhDThesis, author = {Kuznetsov, M. P.}, title = {Construction preference learning models using ordinal-scaled expert estimations (PhD thesis supervised by V.V. Strijov)}, school = {Moscow Institute of Physics and Technology}, year = {2016}, url = {https://mipt.ru/upload/iblock/782/kuznetsov_dissertatsiya.pdf}, doi = {https://mipt.ru/upload/iblock/3cb/kuznetsov_avtoreferat.pdf} } |
2015Ignatov A.D., Strijov V.V. Human activity recognition using quasiperiodic time series collected from a single triaxial accelerometer // Multimedia Tools and Applications, 2015, 17.05.2015 : 1-14. Article |
Abstract: The current generation of portable mobile devices incorporates various types of sensors that open up new areas for the analysis of human behavior. In this paper, we propose a method for human physical activity recognition using time series collected from a single tri-axial accelerometer of a smartphone. Primarily, the method solves a problem of time series segmentation, assuming that each meaningful segment corresponds to one fundamental period of motion. To extract the fundamental period we construct the phase trajectory matrix, applying the technique of principal component analysis. The obtained segments refer to various types of human physical activity. To recognize these activities we use the k-nearest neighbor algorithm and, as an alternative, a neural network. We verify the accuracy of the proposed algorithms by testing them on the WISDM dataset of labeled accelerometer time series from thirteen users. The results show that our method achieves high precision, ensuring nearly 96% recognition accuracy when using the combination of the segmentation and k-nearest neighbor algorithms. |
BibTeX: @article{Ignatov2015HumanActivity, author = {Ignatov, Andrey D. and Strijov, Vadim V.}, title = {Human activity recognition using quasiperiodic time series collected from a single triaxial accelerometer}, journal = {Multimedia Tools and Applications}, year = {2015}, volume = {17.05.2015}, pages = {1-14}, url = {https://rdcu.be/6oBD}, doi = {10.1007/s11042-015-2643-0} } |
Katrutsa A.M., Strijov V.V. Stresstest procedure for feature selection algorithms // Chemometrics and Intelligent Laboratory Systems, 2015, 142 : 172-183. Article |
Abstract: This study investigates the multicollinearity problem and the performance of feature selection methods when datasets have multicollinear features. We propose a stresstest procedure for a set of feature selection methods. This procedure generates test data sets with various configurations of the target vector and features. A number of multicollinear features are inserted in every configuration. A feature selection method returns a set of selected features for a given test data set. To compare the given feature selection methods the procedure uses several quality measures. A criterion of the redundancy of the selected features is proposed. This criterion estimates the number of multicollinear features among the selected ones. To detect multicollinearity it uses the eigensystem of the parameter covariance matrix. In computational experiments we consider the following illustrative methods: Lasso, ElasticNet, LARS, Ridge and Stepwise, and determine the best one, which solves the multicollinearity problem for every considered configuration of the dataset. |
BibTeX: @article{Katrutsa2015Stresstest, author = {Katrutsa, A. M. and Strijov, V. V.}, title = {Stresstest procedure for feature selection algorithms}, journal = {Chemometrics and Intelligent Laboratory Systems}, year = {2015}, volume = {142}, pages = {172-183}, url = {/papers/Katrutsa2014TestGenerationEn.pdf}, doi = {10.1016/j.chemolab.2015.01.018} } |
Stenina M.M., Kuznetsov M.P., Strijov V.V. Ordinal classification using Pareto fronts // Expert Systems with Applications, 2015, 42(14) : 5947-5953. Article |
Abstract: We solve an instance ranking problem using ordinal scaled expert estimations. The experts define a preference binary relation on the set of features. The instance ranking problem is considered as the monotone multiclass classification problem. To solve the problem we use a set of Pareto optimal fronts. The proposed method is illustrated with the problem of categorization of the IUCN Red List threatened species. |
BibTeX: @article{Medvednikova2014POF, author = {Stenina, M. M. and Kuznetsov, M. P. and Strijov, V. V.}, title = {Ordinal classification using Pareto fronts}, journal = {Expert Systems with Applications}, year = {2015}, volume = {42(14)}, pages = {5947-5953}, url = {/papers/Medvednikova2014POF.pdf}, doi = {10.1016/j.eswa.2015.03.021} } |
Rudakov K.V., Sanduleanu L.N., Tokmakova A.A., Yamschikov I.S., Reyer I.A., Strijov V.V. Terrain objects movement detection using SAR interferometry // Computer Research and Modeling, 2015, 7(5) : 1047-1060. Article |
Abstract: To determine movements of infrastructure objects on the Earth's surface, SAR interferometry is used. The method is based on obtaining a series of detailed satellite images of the same area of the Earth's surface at different times. Each image consists of amplitude and phase components. To determine terrain movements, the change of the phase component is used. A method for detecting persistent scatterers and estimating the relative shift of the objects corresponding to them is suggested. |
BibTeX: @article{Sanduleanu2016SAR, author = {Rudakov, K. V. and Sanduleanu, L. N. and Tokmakova, A. A. and Yamschikov, I. S. and Reyer, I. A. and Strijov, V. V.}, title = {Terrain objects movement detection using SAR interferometry}, journal = {Computer Research and Modeling}, year = {2015}, volume = {7(5)}, pages = {1047-1060}, url = {/papers/Rudakov_crm_2015.pdf}, doi = {http://crm.ics.org.ru/journal/article/2370/} } |
Kuznetsov M.P., Clausel M., Amini M.-R., Gaussier E., Strijov V.V. Supervised topic classification for modeling a hierarchical conference structure // in S. Arik et al. (Eds.): International conference on neural information processing, Part 1, LNCS, 2015, 9489 : 90-97. Article |
Abstract: In this paper we investigate the problem of supervised latent modelling for extracting topic hierarchies from data. The supervised part is given in the form of expert information over document-topic correspondence. To exploit the expert information we use a regularization term that penalizes the difference between a predicted and an expert-given model. We hence add the regularization term to the log-likelihood function and use a stochastic EM based algorithm for parameter estimation. The proposed method is used to construct a topic hierarchy over the proceedings of the European Conference on Operational Research and helps to automate the abstract submission system. |
BibTeX: @article{TopicModelsICONIP2015, author = {Kuznetsov, M. P. and Clausel, M. and Amini, M.-R. and Gaussier, E. and Strijov, V. V.}, title = {Supervised topic classification for modeling a hierarchical conference structure}, journal = {in S. Arik et al. (Eds.): International conference on neural information processing, Part 1, LNCS}, year = {2015}, volume = {9489}, pages = {90-97}, url = {/papers/TopicModelsICONIP2015.pdf}, doi = {10.1007/978-3-319-26532-2_11} } |
Aduenko A.A., Rudakov K.V., Reyer I.A., Vasileysky A.S., Karelov A., Strijov V.V. Algorithm of detection and registration of persistent scatters on satellite radar images // Computer optics, 2015, 39(4) : 622-630. Article Rus |
Abstract: To detect small movements of the Earth's surface (with a velocity of less than several centimeters per year) using SAR interferometry methods, it is necessary to find a number of surface areas that remain coherent on radar images over a long period. These areas and the corresponding image points are called persistent scatterers. Two methods of persistent scatterer detection are considered in the paper. The methods are compared by the number of detected points and their average time coherence. The algorithms considered are illustrated with an example of processing a set of 35 radar images. |
BibTeX: @article{Aduenko2015SAR_ComOptics.pdf, author = {Aduenko, A. A. and Rudakov, K. V. and Reyer, I. A. and Vasileysky, A. S. and Karelov, A.I. and Strijov, V. V.}, title = {Algorithm of detection and registration of persistent scatters on satellite radar images}, journal = {Computer optics}, year = {2015}, volume = {39(4)}, pages = {622-630}, url = {/papers/Aduenko2015PSdetection.pdf}, doi = {10.18287/0134-2452-2015-39-4-622-630} } |
Gazizullina R.K., Medvednikova M.M., Strijov V.V. Capacity of railway cargo transportation forecasting // Systems and Means of Informatics, 2015, 25(1) : 144-157. Article Rus |
Abstract: The article studies an algorithm for nonparametric forecasting of railway cargo transportation capacity. The problem considered is forecasting the number of wagons carrying various goods along various routes. The topology of the railway network is given: for all possible pairs of railway lines, information is provided about all blocks of wagons that have moved from one line to another, including the number of wagons in a block, the type of cargo, and the date of the route. The algorithm used for prediction is based on the convolution of the empirical density distribution of the time series values with the loss function. Previously, the forecast was carried out for each railway junction separately. The authors propose to improve the forecast quality by predicting for pairs of lines instead of predicting the departure of all wagons from a given junction. The algorithm is illustrated with daily data on the transportation of 38 types of cargo collected over a year and a half. |
BibTeX: @article{Gazizullina2014RailwayForecasting, author = {Gazizullina, R. K. and Medvednikova, M. M. and Strijov, V. V.}, title = {Capacity of railway cargo transportation forecasting}, journal = {Systems and Means of Informatics}, year = {2015}, volume = {25(1)}, pages = {144-157}, url = {/papers/Gazizullina2014RailwayForecasting.pdf}, doi = {10.14357/08696527150109} } |
Goncharov A.V., Popova M.S., Strijov V.V. Metric time series classification using dynamic warping relative to centroids of classes // Systems and Means of Informatics, 2015, 25(4) : 52-64. Article Rus |
Abstract: This paper discusses the problem of time series classification in the case of several classes. The proposed classification model uses the matrix of distances between time series. The distance measure is defined by the dynamic time warping method. The dimension of the distance matrix is very high. To decrease this dimension, the paper introduces the centroids of each class as reference objects. The lower-dimensional distance matrix describes the distances between all objects and the reference objects. We use this method for human activity recognition and investigate the quality of classification on data from a mobile accelerometer. This metric classification algorithm is compared with a separating classification algorithm. |
BibTeX: @article{Goncharov2015MetricClassification, author = {Goncharov, A. V. and Popova, M. S. and Strijov, V. V.}, title = {Metric time series classification using dynamic warping relative to centroids of classes}, journal = {Systems and Means of Informatics}, year = {2015}, volume = {25(4)}, pages = {52-64}, url = {/papers/Goncharov2015MetricClassification.pdf}, doi = {10.14357/08696527150404} } |
Katrutsa A.M., Strijov V.V. The multicollinearity problem for feature selection methods in regression // Informational Technologies, 2015, 1 : 8-18. Article Rus |
Abstract: The paper investigates the multicollinearity problem in regression analysis and its influence on the performance of feature selection methods. The authors propose a procedure to test feature selection methods. A criterion is proposed to compare the feature selection methods according to their performance when multicollinearity is present. The feature selection methods are also compared according to other well-known evaluation measures. Methods to generate data sets of different multicollinearity types are proposed. The authors investigate the performance of feature selection methods by testing them on data sets of different multicollinearity types. |
BibTeX: @article{Katrutsa2014TestGeneration, author = {A. M. Katrutsa and V. V. Strijov}, title = {The multicollinearity problem for feature selection methods in regression}, journal = {Informational Technologies}, year = {2015}, volume = {1}, pages = {8-18}, url = {/papers/Katrutsa2014TestGeneration.pdf} } |
Popova M.S., Strijov V.V. Selection of optimal physical activity classification model using measurements of accelerometer // Informatics and applications, 2015, 9(1) : 76-86. Article Rus |
Abstract: In this paper we solve the problem of selecting optimal stable models for classification of physical activity. Each type of physical activity of a particular person is described by a set of features generated from the accelerometer time series. When the features are multicollinear, the selection of stable models is hampered by the need to evaluate a large number of parameters of these models. Evaluation of optimal parameter values is also difficult because the error function has a large number of local minima in the parameter space. In the paper we choose the optimal models from the class of two-layer artificial neural networks. We solve the problem of finding the Pareto optimal front of the set of models. The paper presents a stepwise strategy of building optimal stable models. The strategy includes steps of deleting and adding parameters, criteria for pruning and growing the model, and criteria for stopping the building process. The computational experiment compares models generated by the proposed strategy on three quality criteria: complexity, accuracy, and stability. |
BibTeX: @article{Popova2014OptimalModelSelection, author = {Maria S. Popova and Vadim V. Strijov}, title = {Selection of optimal physical activity classification model using measurements of accelerometer}, journal = {Informatics and applications}, year = {2015}, volume = {9(1)}, pages = {76-86}, url = {/papers/Popova2014OptimalModelSelection.pdf}, doi = {10.14357/19922264150107} } |
Popova M.S., Strijov V.V. Building superposition of deep learning neural networks for solving the problem of time series classification // Systems and Means of Informatics, 2015, 25(3) : 60-77. Article Rus |
Abstract: This paper solves the problem of time-series classification using deep learning neural networks. The paper proposes to use a multilevel superposition of models belonging to the following classes of neural networks: two-layer neural networks, Boltzmann machines, and autoencoders. The lower levels of the superposition extract informative features from noisy high-dimensional data, while the upper level solves the classification problem based on these extracted features. The proposed model has been tested on two samples of physical activity time series. The classification results obtained by the proposed model in the computational experiment were compared with the results obtained on the same datasets by foreign authors. The study showed the possibility of using deep learning neural networks for solving problems of physical activity time-series classification. |
BibTeX: @article{PopovaStrijov2015DeepLearning, author = {Popova, M. S. and Strijov, V. V.}, title = {Building superposition of deep learning neural networks for solving the problem of time series classification}, journal = {Systems and Means of Informatics}, year = {2015}, volume = {25(3)}, pages = {60-77}, url = {/papers/PopovaStrijov2015DeepLearning.pdf}, doi = {10.14357/08696527150304} } |
Stenina M.M., Strijov V.V. Forecasts reconciliation for hierarchical time series forecasting problem // Informatics and applications, 2015, 9(2) : 77-89. Article Rus |
Abstract: The hierarchical time series forecasting problem is studied. Time series forecasts must satisfy the physical constraints and the hierarchical structure. In this paper a new algorithm for hierarchical time series forecast reconciliation is proposed. The algorithm is called GTOp (Game-theoretically Optimal reconciliation). It guarantees that the quality of the reconciled forecasts is not worse than that of the independent forecasts. The approach is based on a Nash equilibrium search for an antagonistic game and turns the forecast reconciliation problem into an optimization problem with equality and inequality constraints. It is proved that a Nash equilibrium in pure strategies exists in the game if some assumptions about the hierarchical structure, the physical constraints, and the loss function are satisfied. The algorithm performance is demonstrated for different types of hierarchical structures of time series. |
BibTeX: @article{Stenina2014Reconciliation.pdf, author = {Stenina, M. M. and Strijov, V. V.}, title = {Forecasts reconciliation for hierarchical time series forecasting problem}, journal = {Informatics and applications}, year = {2015}, volume = {9(2)}, pages = {77-89}, url = {/papers/Stenina2014Reconciliation.pdf}, doi = {10.14357/19922264150209} } |
Strijov V.V., Weber G.W., Weber R., Sureyya O.A. Editorial of the special issue data analysis and intelligent optimization with applications // Machine Learning, 2015, 101(1-3) : 1-4. Article Eng http://link.springer.com/article/10.1007/s10994-015-5523-y |
Abstract: This special issue on "Data Analysis and Intelligent Optimization with Applications" follows a previous special issue of this journal on the interplay of Machine Learning and Optimization, "Model Selection and Optimization in ML" (Machine Learning 85:1-2, October 2011). This time we shift our focus to applications of data analysis and optimization techniques. Optimization problems underlie most machine learning approaches. Due to the emergence of new practical applications, new problems and challenges for traditional approaches arise. Emergent applications generate new data analysis problems, which, in turn, boost new research in optimization. The contribution of machine learning researchers to the field of optimization is of considerable significance and should not be overlooked. This special issue collected solutions adapted for real-world problems involving massive and large-scale data sets, online data, and imbalanced data. We encouraged submission of papers devoted to combining machine learning and data analysis techniques with advances in optimization to produce methods of Intelligent Optimization, both theoretical and practical. Our goal for this special issue was to bring together researchers working in different areas related to analytics and optimization. |
BibTeX: @article{Strijov2015Editorial, author = {Strijov, V. V. and Weber, G. W. and Weber, R. and Sureyya, O. A.}, title = {Editorial of the special issue data analysis and intelligent optimization with applications}, journal = {Machine Learning}, year = {2015}, volume = {101(1-3)}, pages = {1-4}, url = {http://link.springer.com/content/pdf/10.1007%2Fs10994-015-5523-y.pdf}, doi = {10.1007/s10994-015-5523-y} } |
Katrutsa A.M., Kuznetsov M.P., Rudakov K.V., Strijov V.V. Metric concentration search procedure using reduced matrix of pairwise distances // Intelligent Data Analysis, 2015, 19(5) : 1091-1108. Article Eng http://content.iospress.com/articles/intelligent-data-analysis/ida760 |
Abstract: This paper presents a new fast clustering algorithm, RhoNet, based on the metric concentration location procedure. To locate the metric concentration, the algorithm uses a reduced matrix of pairwise rank distances. The key feature of the proposed algorithm is that it does not need the exhaustive matrix of pairwise distances. This feature reduces computational complexity. The algorithm is designed to solve the protein secondary structure recognition problem. The computational experiment collects tests to analyze the performance of the algorithm and the dependence of its quality on the structure parameters. The algorithm is compared with k-modes and tested on different metrics and data sets. |
BibTeX: @article{Katrutsa2014RhoNet, author = {Katrutsa, A. M. and Kuznetsov, M. P. and Rudakov, K. V. and Strijov, V. V.}, title = {Metric concentration search procedure using reduced matrix of pairwise distances}, journal = {Intelligent Data Analysis}, year = {2015}, volume = {19(5)}, pages = {1091-1108}, url = {/papers/Katrutsa2014RhoNetClustering.pdf}, doi = {10.3233/IDA-150760} } |
2014Kuznetsov M.P., Strijov V.V. Methods of expert estimations concordance for integral quality estimation // Expert Systems with Applications, 2014, 41(4-2) : 1988-1996. Article |
Abstract: The paper presents new methods of ranking alternatives using expert estimations and measured data. The methods use expert estimations of object quality and criteria weights. These expert estimations are changed during the computation. The expert estimations are supposed to be measured in linear and ordinal scales. Each object is described by a set of linear, ordinal, or nominal criteria. The constructed object estimations must contradict neither the measured criteria nor the expert estimations. The paper presents methods of expert estimation concordance. The expert can correct the result of this concordance. |
BibTeX: @article{KuznetsovStrijov2014MethodsExpert, author = {M. P. Kuznetsov and V. V. Strijov}, title = {Methods of expert estimations concordance for integral quality estimation}, journal = {Expert Systems with Applications}, year = {2014}, volume = {41(4-2)}, pages = {1988-1996}, url = {/papers/Kuznetsov-Strijov2013Concordance.pdf}, doi = {10.1016/j.eswa.2013.08.095} } |
Motrenko A.P., Strijov V.V., Weber G.-W. Bayesian sample size estimation for logistic regression // Journal of Computational and Applied Mathematics, 2014, 255 : 743-752. Article |
Abstract: The problem of sample size estimation is important in medical applications, especially in cases of expensive measurements of immune biomarkers. The paper describes the problem of logistic regression analysis, including model feature selection, and covers sample size determination algorithms, namely methods of univariate statistics, logistic regression, cross-validation, and Bayesian inference. The authors, treating the regression model parameters as a multivariate variable, propose to estimate the sample size using the distance between parameter distribution functions on cross-validated data sets. |
BibTeX: @article{Motrenko2013Bayesian, author = {Anastasiya P. Motrenko and Vadim V. Strijov and Gerhard-Wilhelm Weber}, title = {Bayesian sample size estimation for logistic regression}, journal = {Journal of Computational and Applied Mathematics}, year = {2014}, volume = {255}, pages = {743-752}, url = {/papers/MotrenkoStrijovWeber2012SampleSize.pdf}, doi = {10.1016/j.cam.2013.06.031} } |
Aduenko A.A., Strijov V.V. Joint feature and object selection in multiclass classification of documents // Infocommunication Technologies, 2014, 1 : 47-54. Article Rus |
Abstract: The article is dedicated to the problem of search engine results ranking. An algorithm of multiclass classification with joint selection of features and objects is proposed. It is modified for interclass relevance comparison. Feature and object selection is performed with stepwise regression and with a genetic algorithm. The results obtained using both algorithms are compared. The proposed multiclass classification algorithm is tested on synthetic data and on data of Yandex search engine results. |
BibTeX: @article{Aduenko2013Multiclass, author = {A. A. Aduenko and V. V. Strijov}, title = {Joint feature and object selection in multiclass classification of documents}, journal = {Infocommunication Technologies}, year = {2014}, volume = {1}, pages = {47-54}, url = {/papers/Aduenko2013Multiclass.pdf} } |
Kuzmin A.A., Aduenko A.A., Strijov V.V. Thematic classification using expert model for major conference abstracts // Information Technologies, 2014, 6 : 22-26. Article Rus |
Abstract: The aim of this paper is to verify the thematic structure of a collection of conference abstracts. The conference consists of main Areas; each main Area consists of Streams; each Stream contains Sessions; each Session consists of several talks. This conference structure determines a thematic model of the conference. Thousands of scientists submit their abstracts and participate in a major conference, and the thematic model of such a conference has a multilevel structure. The program committee constructs an expert thematic model of the conference every year. Due to the huge number of experts in the program committee, the problem of thematic integrity verification occurs. The aim of this paper is to find inconsistencies in the expert thematic model using a text clustering approach. We consider an abstracts collection with a given expert model. The base assumption is that the terms of an abstract determine its theme and its position in the thematic model. We propose a similarity function of two abstracts. We introduce a quality function, which determines the quality of the thematic model. It involves the intracluster and intercluster similarities. The proposed fast non-metric clustering algorithm maximizes this quality function. To make the constructed model similar to the given expert model, the algorithm does not change the constructed model if the increase of the quality function is less than a fixed value of the threshold parameter. This threshold affects the number of inconsistencies revealed in the expert model. The proposed method constructs a thematic model for the abstracts of EURO 2013. |
BibTeX: @article{Kuzmin2014Thematic, author = {A. A. Kuzmin and A. A. Aduenko and V. V. Strijov}, title = {Thematic classification using expert model for major conference abstracts}, journal = {Information Technologies}, year = {2014}, volume = {6}, pages = {22-26}, url = {/papers/Kuzmin2014Thematic.pdf} } |
Motrenko A., Strijov V.V. Obtaining an aggregated forecast of railway freight transportation using Kullback-Leibler distance // Informatics and applications, 2014, 8(2) : 86-97. Article Rus |
Abstract: This study addresses the problem of obtaining an aggregated forecast of railway freight transportation. To improve the quality of the aggregated forecast, we solve a time series clustering problem such that the time series in each cluster belong to the same distribution. To solve the clustering problem, we need to estimate the distance between the empirical distributions of the time series. We introduce a two-sample test based on the Kullback-Leibler distance between histograms of the time series. We provide theoretical and experimental research of the suggested test. As a demonstration, a clustering of a set of railway time series based on the Kullback-Leibler distance between time series is obtained. |
BibTeX: @article{Motrenko2014KullbackLeibler, author = {A.P. Motrenko and V. V. Strijov}, title = {Obtaining an aggregated forecast of railway freight transportation using Kullback-Leibler distance}, journal = {Informatics and applications}, year = {2014}, volume = {8(2)}, pages = {86-97}, url = {/papers/MotrenkoStrijov2014KL.pdf} } |
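The distance mentioned in the abstract above can be illustrated with a minimal sketch. It only computes a Kullback-Leibler distance between two histograms built over shared bins; it is not the two-sample test from the paper. The function names, the bin edges, and the smoothing constant `eps` are assumptions made for this illustration.

```python
import math

def histogram(values, edges):
    """Count values into bins given by sorted edges; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    return counts

def kl_distance(p_counts, q_counts, eps=1e-9):
    """KL divergence D(P || Q) between two histograms over the same bins.

    eps is an assumed smoothing constant that keeps empty bins from producing log(0).
    """
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    distance = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        distance += p * math.log(p / q)
    return distance
```

The divergence is not symmetric, so a clustering procedure would typically either fix the argument order or symmetrize it, for example by summing `kl_distance(p, q)` and `kl_distance(q, p)`.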
Stenina M.M., Strijov V.V. Reconciliation of aggregated and disaggregated time series forecasts in nonparametric forecasting problems // Systems and Means of Informatics, 2014, 24(2) : 21-34. Article Rus |
Abstract: Many applications require forecasting a large number of time series with a hierarchical structure, and the forecasts must be reconciled across the hierarchy. This paper proposes a new algorithm for reconciling hierarchical time series forecasts. The algorithm is based on solving a constrained optimization problem. It reconciles forecasts with a nonplanar hierarchical structure and takes into account physical constraints on the forecasted values, such as non-negativity or a maximal value. The algorithm performance is illustrated with railroad station occupancy data from the Omsk region. The forecast quality is compared with the forecast quality of the optimal reconciliation algorithm. The algorithm performance is also demonstrated for a nonplanar hierarchical structure of time series. |
BibTeX: @article{Stenina2014RailRoadsMatching, author = {Stenina, M. M. and Strijov, V. V.}, title = {Reconciliation of aggregated and disaggregated time series forecasts in nonparametric forecasting problems}, journal = {Systems and Means of Informatics}, year = {2014}, volume = {24(2)}, pages = {21-34}, url = {/papers/Stenina2014RailRoadsMatching.pdf}, doi = {10.14357/08696527140202} } |
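As a rough illustration of what reconciliation means, the sketch below handles the simplest case only: one aggregate series and its children, with the single constraint that the aggregate equals the sum of the children. It returns the least-squares closest reconciled forecasts in closed form. This is an assumed toy setting, not the algorithm of the paper, and it ignores the non-negativity and maximal-value constraints the abstract mentions. The function name `reconcile` is hypothetical.

```python
def reconcile(total_forecast, child_forecasts):
    """Project independent forecasts onto the constraint total == sum(children).

    Minimizes the squared change of all forecasts; the residual is split
    equally among the total and the n children (closed-form projection).
    """
    n = len(child_forecasts)
    residual = total_forecast - sum(child_forecasts)
    shift = residual / (n + 1)
    reconciled_total = total_forecast - shift
    reconciled_children = [c + shift for c in child_forecasts]
    return reconciled_total, reconciled_children
```

For example, an aggregate forecast of 10 with child forecasts 3 and 4 leaves a residual of 3, so each of the three values moves by 1 toward agreement. Adding bound constraints would turn this into a quadratic program that needs a numerical solver.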
Varfolomeeva A.A., Strijov V.V. An algorithm for bibliographic records parsing using structure learning methods // Information Technologies, 2014, 7 : 11-15. Article Rus |
Abstract: The paper solves the applied problem of structured text segmentation: each segment of a bibliographic record must correspond to its field type in the BibTeX format, and each record must correspond to its bibliographic type. This problem arises due to the existence of different standards for bibliographic records, so an algorithm for determining the field types of bibliographic records must be independent of the specific standards of their composition. To solve the problem of determining the field type, a method of constructing an "objects" matrix and "answers" matrices is proposed. The authors propose an algorithm for parsing bibliography lists using the structure regression method; the optimization problem for the regression model's parameters is also solved. The bibliographic types of the records are clustered according to the results of the field segmentation. The quality of the constructed model is investigated using a collection of non-parsed bibliography lists. The paper shows that the proposed algorithm achieves good segmentation and clustering quality given a sufficient training sample. |
BibTeX: @article{VarfolomeevaStrijov2013FeatureSelection, author = {Varfolomeeva, A. A. and Strijov, V. V.}, title = {An algorithm for bibliographic records parsing using structure learning methods}, journal = {Information Technologies}, year = {2014}, volume = {7}, pages = {11-15}, url = {/papers/Varfolomeeva2013StrcLearning.pdf} } |
Aduenko A.A., Strijov V.V. Multimodelling and Object Selection for Banking Credit Scoring // Conference of the International Federation of Operational Research Societies, 2014 : 138. InProceedings |
Abstract: To construct a bank credit scoring model one must select a set of informative objects (client records) to get the unbiased estimation of the model parameters. This set must have no outliers. The authors propose an object selection algorithm for mixture of regression models. It is based on analysis of the covariance matrix for the parameters estimations. The computational experiment shows statistical significance of the classification quality improvement. The algorithm is illustrated with the cash loans and heart disease data sets. |
BibTeX: @inproceedings{Aduenko2014MultomodelingMulticollinear_IFORS, author = {Alexander A. Aduenko and Vadim V. Strijov}, title = {Multimodelling and Object Selection for Banking Credit Scoring}, booktitle = {Conference of the International Federation of Operational Research Societies}, year = {2014}, pages = {138}, url = {/papers/Aduenko2014MultiModel_IFORS.pdf} } |
Katrutsa A.M., Strijov V.V. Multicollinearity: Performance Analysis of Feature Selection Algorithms // Conference of the International Federation of Operational Research Societies, 2014 : 138. InProceedings |
Abstract: We investigate the multicollinearity problem and its influence on the performance of feature selection methods. The paper proposes a testing procedure for feature selection methods. We discuss the criteria for comparing feature selection methods according to their performance when multicollinearity is present. Feature selection methods are also compared according to other evaluation measures. We propose a method for generating test data sets with different kinds of multicollinearity. The authors draw conclusions about the performance of feature selection methods in the presence of multicollinearity. |
BibTeX: @inproceedings{Katrutsa2014MultomodelingMulticollinear_IFORS, author = {Alexandr M. Katrutsa and Vadim V. Strijov}, title = {Multicollinearity: Performance Analysis of Feature Selection Algorithms}, booktitle = {Conference of the International Federation of Operational Research Societies}, year = {2014}, pages = {138}, url = {/papers/Katrutsa2014MultiCollinear_IFORS.pdf} } |
Kuzmin A.A., Aduenko A.A., Strijov V.V. Thematic Classification for EURO/IFORS Conference Using Expert Model // Conference of the International Federation of Operational Research Societies, 2014 : 175. InProceedings |
Abstract: The decision support system predicts the areas, streams and sessions for the abstracts of a major conference. Abstract collections from the previous EURO/IFORS (2010, 2012, 2013) conferences and their expert thematic models are considered. The terminological dictionary of the conference and the global thematic model of these collections are constructed. A similarity function between two abstracts is proposed. The non-metric hierarchical clustering algorithm which considers a constructed global thematic model is used to construct the thematic model of a new conference without an expert model. |
BibTeX: @inproceedings{Kuzmin2014Thematic_INFORS, author = {Arsentii A. Kuzmin and Alexander A. Aduenko and Vadim V. Strijov}, title = {Thematic Classification for EURO/IFORS Conference Using Expert Model}, booktitle = {Conference of the International Federation of Operational Research Societies}, year = {2014}, pages = {175}, url = {/papers/Kuzmin2014Thematic_INFORS.pdf} } |
Kuznetsov M.P., Strijov V.V. Partial Orders Combining for the Object Ranking Problem // Conference of the International Federation of Operational Research Societies, 2014 : 157. InProceedings |
Abstract: We propose a new method for the ordinal-scaled object ranking problem. The method is based on the combining of partial orders corresponding to the ordinal features. Every partial order is described with a positive cone in the object space. We construct the solution of the object ranking problem as the projection to a superposition of the cones. To restrict model complexity and prevent overfitting we reduce dimension of the superposition and select most informative features. The proposed method is illustrated with the problem of the IUCN Red List monotonic categorization. |
BibTeX: @inproceedings{Kuznetsov2014PartialOrders_IFORS, author = {Mikhail P. Kuznetsov and Vadim V. Strijov}, title = {Partial Orders Combining for the Object Ranking Problem}, booktitle = {Conference of the International Federation of Operational Research Societies}, year = {2014}, pages = {157}, url = {/papers/Kuznetsov2014PartialOrder_IFORS.pdf} } |
Matrosov M., Strijov V.V. Short-Term Forecasting of Musical Compositions Using Chord Sequences // Conference of the International Federation of Operational Research Societies, 2014 : 229. InProceedings |
Abstract: The objective is to predict a sequence of chords. It is treated as a multivariate time series of discrete values. A chord is represented as an array of half-tone sounds within one octave. We utilize a classifier based on probability distributions over chord sequences that are estimated both on a big training set and on some revealed part of the forecasted melody. It shows robust forecasting on a set of 50 000 MIDI files. The novelty is the model selection algorithm and the invariant representation of chords. The same technique can be used to predict or synthesize various types of discrete time series. |
BibTeX: @inproceedings{Matrosov2014Musical_IFORS, author = {Mikhail Matrosov and Vadim V. Strijov}, title = {Short-Term Forecasting of Musical Compositions Using Chord Sequences}, booktitle = {Conference of the International Federation of Operational Research Societies}, year = {2014}, pages = {229}, url = {/papers/Matrosov2014Musical_IFORS.pdf} } |
Strijov V.V., Kuznetsov M.P., Motrenko A.P. Structure learning and forecasting model generation // Conference of the International Federation of Operational Research Societies, 2014 : 101. InProceedings |
Abstract: The aim of the study is to suggest a method to forecast a structure of a regression model superposition, which approximates a data set in terms of some quality function. The problem: algorithms of model selection are computationally complex due to the large number of models. The solution: we developed a model structure forecasting algorithm based on previously selected models. |
BibTeX: @inproceedings{Strijov2014Structure_IFORS, author = {V. V. Strijov and M. P. Kuznetsov and A. P. Motrenko}, title = {Structure learning and forecasting model generation}, booktitle = {Conference of the International Federation of Operational Research Societies}, year = {2014}, pages = {101}, url = {/papers/Strijov2014StructLearning_IFORS.pdf} } |
Sologub R.A. Algorithms of inductive model generation and transformation for non-linear regression problems (PhD thesis supervised by V.V. Strijov). Russian Academy of Sciences, Computing Center, 2014. PhdThesis Rus |
Abstract: The thesis provides a solution to the problem of automatic generation and validation of quantitative mathematical models. The considered models are used for describing the results of measurements and experiments. In the thesis we investigate a fundamental problem of automatic model generation in the data analysis field. The generated models are used for approximation, analysis and forecasting of the results of experiments. To generate a model we consider the expert-given requirements on the model structure. This consideration allows us to construct interpretable models that adequately describe the results of measurements. To construct an adequate model we use expert-given basic functions and a set of generation rules. The model is represented as a superposition of the basic functions. The generation rules define the admissibility of superpositions and exclude the generation of isomorphic models. We propose to develop the existing methods of automatic model generation. In particular, we propose to consider expert requirements on the model structure and to rank the models according to the expert preferences. The proposed methods of isomorphic superposition search are based on isomorphic subgraph search and on graph substitution. We investigate the methods and algorithms of model generation, their properties, complexity and stability. When solving an applied problem of mathematical modeling, the existing knowledge and expert information about the model structure are often insufficient to construct an efficient model. A lack of independent variables makes the methods of model and feature generation very promising. The idea of feature generation is based on generating new independent variables: images of the original variables under a set of successive mappings. These mappings are called the basic functions. Previously, applied problems were considered in terms of the present approach. 
The basic function construction and feature generation approaches were used for economic and industrial problems. While solving these problems, the researchers didn't investigate the existence, completeness and correctness of the proposed algorithm. In the thesis we develop the theoretical validation of the correctness and admissibility of the superposition generation methods and of their convergence. We propose methods for optimizing the model structure. The group method of data handling, an example of a model generation method, was considered by A.G. Ivakhnenko. In the case of a linear model the method generates new features using the multiplication operation. Using the Kolmogorov-Gabor polynomials, the algorithm generates models of different complexity according to a set of criteria. As a result, the method finds the model of optimal complexity described by an equation or a system of equations. An important stage in the development of regression models was the consideration of non-linear models. This approach is widely described by G. Seber: he considered construction and parameter estimation for non-linear models. To estimate the parameters, the Levenberg-Marquardt method was proposed. J. Koza and N. Zelinka proposed a symbolic regression technique for inductive model generation. The method finds an optimal model from the set of superpositions by genetic programming. Inductive model generation was used to solve an applied problem of determining the optimal antenna form. V.V. Strijov developed the ideas of inductive model generation by using coherent Bayesian inference for parameter estimation. When analysing the model structure, the most convenient representation of a superposition is a tree graph. Therefore the methods of graph transformation are applied to the superpositions. These methods allow us to formally describe the structure optimization procedures. 
We consider a categorial representation of graph transformations and the conditions of rule applicability. For tree transformations we use elementary graph patterns and construct isomorphic graphs of more complex structure. |
BibTeX: @phdthesis{Sologub2014PhDThesis, author = {Sologub, R. A.}, title = {Algorithms of inductive model generation and transformation for non-linear regression problems (PhD thesis supervised by V.V. Strijov)}, school = {Russian Academy of Sciences, Computing Center}, year = {2014}, url = {/papers/Sologub2014Disser-0018d.pdf} } |
Strijov V.V. Model generation and selection for regression and classification problems (DSc Thesis). Russian Academy of Sciences, Computing Center, 2014. PhdThesis Rus |
Abstract: The thesis is devoted to the problem of model selection for regression and classification. According to the proposed approach, the models are selected from an inductively generated set. We propose to analyse the distribution of model parameters to choose the model of optimal complexity. There are two ways to construct models describing observed data: mathematical modelling and data analysis. Models of the first type can be interpreted by experts in the field of study [Krasnoshchyokov: 2000]. Models of the second type perform more efficiently, but don't always have a clear interpretation [Bishop: 2006]. A topical problem of theoretical computer science is to combine the advantages of the two approaches to obtain efficient interpretable models. The key issue is to construct adequate regression and classification models for forecasting problems. The problem is to find models of optimal complexity describing the data with a given accuracy. An additional restriction is the interpretability of the models for an expert in the field of study. The goal of the research is to propose and investigate methods of model selection from an inductively generated set. The problem of model selection from a countable successively generated set is novel. To formulate the problem setting we used the broad material in the fields of model and feature selection, which is one of the key problems in the machine learning and data analysis area. The basic problem of the study is to develop methods for successive model generation and for estimating the parameter distribution. The estimates of parameter covariance matrices are used to simplify the model selection procedure. The key challenge of this problem is parameter estimation for a large number of structurally complex regression models. The relation between model generation and selection problems was investigated by A.G. Ivakhnenko in the early 1980s. 
According to the proposed group method of data handling [Ivakhnenko: 1981, Madala: 1994], the model of optimal structure can be found by the successive generation of linear models using the Kolmogorov-Gabor polynomial of the independent variables. The criterion of optimal model structure is given by the cross-validation procedure. Unlike the GMDH, the symbolic regression method [Koza: 2005, Zelinka: 2008] generates arbitrary non-linear superpositions of basic functions. In recent years the problem of model complexity analysis for symbolic regression has become a significant field of study [Hazan: 2006, Vladislavleva: 2009]. Initially the methods of inductive model generation were proposed in terms of the group method of data handling. The structure of the superposition was defined by external quality criteria. Afterwards these criteria were explained in terms of the data generation hypothesis and Bayesian inference. Solving the problem of successive model generation raises the problem of estimating the informativity of the superposition elements. In terms of Bayesian regression [Bishop: 2000], the probability density of the model parameters is used to estimate informativity. The probability density is a parametric function; its parameters are referred to as hyperparameters [Bishop: 2006]. The hyperparameter analysis can be regarded as one of the model selection methods. For modifying a superposition of non-linear models, the optimal brain damage method was proposed [LeCun: 1990]. According to this method, an element of the superposition is regarded as non-informative if the saliency value of the error function doesn't exceed the given threshold. The model selection problem is one of the key problems of the regression analysis field. One of the present model selection methods is the minimum description length principle. The MDL principle chooses the efficient model with the best compression [Grunwald: 2005]. 
The problem of model comparison is investigated in detail by [MacKay: 1994-2003]. As an alternative to the information criteria [Burnham: 2002, Lehman: 2005], a coherent Bayesian inference was proposed. Its first level estimates the model parameters. Its second level adjusts the hyperparameters. According to this method, a more complex model is less likely to be selected at comparable values of the error function. The principles of the Bayesian approach in the linear model case were proposed by the authors [Celeux: 2006, Massart: 2008, Fleury: 2006]. At the same time, the mentioned principles and approaches leave open the questions investigated in the present thesis. For this reason we propose to create and develop the theory of regression model generation and selection. The problem is as follows. The set of models of the given class is inductively generated from the set of parametric basic functions given by the experts. Each model is an admissible superposition of the basic functions. The interpretability of the models is guaranteed by the expert-given basic functions, which are the basic elements of the model superposition. Each class of models is defined by the rules of superposition generation. The required model accuracy is achieved through the breadth of the class of basic models. The optimality criterion includes the concepts of model complexity and accuracy, as well as the data generation hypothesis. Along with the parameter estimates, the proposed method estimates the model hyperparameters. Using information about the hyperparameters, the method estimates the informativity of the superposition elements and optimizes the superposition structure. The optimality criterion, given by the data generation hypothesis, allows one to choose the optimal models. Thus, we propose a new approach to the formulated problem. The set of models is generated inductively from the set of basic functions given by the experts. 
Each model is considered as an admissible superposition of the basic functions. Together with the parameter estimation we propose to estimate the hyperparameters of the parameter distribution. Using the parameter estimates we measure the informativity of the superposition elements and optimize the model structure. We choose the optimal model according to the quality criterion given by the data generation hypothesis. Construction of new methods of model selection for classification and regression is a major and topical problem of recognition theory. |
BibTeX: @phdthesis{Strijov2014DScThesis, author = {Strijov, V. V.}, title = {Model generation and selection for regression and classification problems (DSc Thesis)}, school = {Russian Academy of Sciences, Computing Center}, year = {2014}, url = {/papers/Strijov2015ModelSelectionRu.pdf} } |
2013Tsyganova S.V., Strijov V.V. The construction of hierarchical thematic models for document collection // Applied Informatics, 2013, 1 : 109-115. Article |
Abstract: This work is devoted to detecting the themes of a document collection and their hierarchical structure. The main task is to construct a hierarchical thematic model for a document collection. To solve this task we suggest using probabilistic topic models. The main attention is paid to hierarchical thematic models and, in particular, to the properties of the PLSA and LDA algorithms. The peculiarity of constructing a hierarchical model is the transition from the concept of a "bag of words" to the concept of a "bag of themes". The work is illustrated on the abstracts of the EURO-2012 conference and on synthetic data. |
BibTeX: @article{TsyganovaStrijov2013Hierarchical, author = {Tsyganova, S. V. and Strijov, V. V.}, title = {The construction of hierarchical thematic models for document collection}, journal = {Applied Informatics}, year = {2013}, volume = {1}, pages = {109-115}, url = {/papers/Tsyganova2013TopicHierarchy.pdf} } |
Aduenko A.A., Strijov V.V. Optimal text placement for titles of documents in collection // Software Engineering, 2013, 3 : 21-25. Article Rus |
Abstract: We consider a method for visualizing the results of thematic clustering of a document collection. The pairwise distance matrix is projected onto a plane using PCA. It is required to place the titles of the documents on the plane. A loss function that allows a minimal overlap to be reached is suggested. The BFGS algorithm is used for its optimisation. The suggested method is illustrated by visualizing conference abstracts. |
BibTeX: @article{Aduenko2013TextVisualizing, author = {A. A. Aduenko and V. V. Strijov}, title = {Optimal text placement for titles of documents in collection}, journal = {Software Engineering}, year = {2013}, volume = {3}, pages = {21-25}, url = {/papers/AduenkoStrijov2013TextVisualizing.pdf} } |
Budnikov E.A., Strijov V.V. Estimating probabilities of text strings in document collections // Information Technologies, 2013, 4 : 40-45. Article Rus |
Abstract: We consider the problem of estimating the probabilities of strings in a document. To solve the problem, the n-gram model is used. N-gram classes are proposed to deal with the large number of model parameters to be estimated. Three discounting models (Good-Turing, Katz and absolute discounting) are used to solve the problem of zero string probabilities. The proposed model is illustrated by computational experiments on real data. |
BibTeX: @article{BudnikovStrijov2013Estimation, author = {E. A. Budnikov and V. V. Strijov}, title = {Estimating probabilities of text strings in document collections}, journal = {Information Technologies}, year = {2013}, volume = {4}, pages = {40-45}, url = {/papers/BudnikovStrijov2013Estimation.pdf} } |
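Of the three discounting models named in the abstract above, absolute discounting is the simplest to sketch. The example below is a minimal bigram version with interpolation to a unigram estimate; it is an illustration under stated assumptions, not the model from the paper. The function name, the discount value `d = 0.5`, and the fallback to the unigram estimate for unseen histories are choices made here.

```python
from collections import Counter

def absolute_discount_bigram(tokens, d=0.5):
    """Return p(w, h), an estimate of P(w | h) with interpolated absolute discounting.

    d (0 < d < 1) is an assumed discount subtracted from every seen bigram count;
    the freed probability mass is redistributed in proportion to unigram frequencies.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    history_total = Counter()  # number of times h is followed by any word
    followers = Counter()      # number of distinct words that follow h
    for (h, _w), count in bigrams.items():
        history_total[h] += count
        followers[h] += 1
    n = len(tokens)

    def p(w, h):
        p_unigram = unigrams[w] / n
        if history_total[h] == 0:
            return p_unigram   # unseen history: fall back to the unigram estimate
        discounted = max(bigrams[(h, w)] - d, 0.0) / history_total[h]
        backoff_mass = d * followers[h] / history_total[h]
        return discounted + backoff_mass * p_unigram

    return p
```

With this construction a bigram that never occurred in the training tokens still receives a nonzero probability, and the conditional probabilities over the observed vocabulary sum to one for every seen history.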
Ivanova A.V., Aduenko A.A., Strijov V.V. Algorithm of construction logical rules for text segmentation // Software Engineering, 2013, 6 : 41-48. Article Rus |
Abstract: We consider a method for recovering the BibTeX structure of bibliographic records from their text representation. The structure is recovered using logical rules defined on an expert-given set of regular expressions. An algorithm based on stub covers is proposed for constructing the logical rules. The algorithm is illustrated with the problem of searching for the structure in bibliographic records represented by text strings. |
BibTeX: @article{IvanovaAduenkoStrijov2013TextMarkUp, author = {A. V. Ivanova and A. A. Aduenko and V. V. Strijov}, title = {Algorithm of construction logical rules for text segmentation}, journal = {Software Engineering}, year = {2013}, volume = {6}, pages = {41-48}, url = {/papers/Ivanova2012LogicStructureCor.pdf} } |
Kuzmin A.A., Strijov V.V. Validation of thematic models for document collections // Software Engineering, 2013, 4 : 16-20. Article Rus |
Abstract: We consider a collection of documents with an expert thematic model. To verify the adequacy of the expert model, we build an algorithmic model by hierarchical clustering of the text collection. The agglomerative and divisive clustering methods are investigated. The error of the algorithmic model in comparison with the expert model is estimated. The differences between the expert model and the algorithmic model are visualized. |
BibTeX: @article{Kuzmin2013ThematicClustering, author = {A. A. Kuzmin and V. V. Strijov}, title = {Validation of thematic models for document collections}, journal = {Software Engineering}, year = {2013}, volume = {4}, pages = {16-20}, url = {/papers/Kuzmin2013ThematicClustering.pdf} } |
Medvednikova M.M., Strijov V.V. Construction of rank-scaled quality integral indicator for scientific publications in using co-clustering // Notices of Tula State University, 2013, 1 : 154-165. Article Rus |
Abstract: A method for measuring the quality of scientific publications is proposed. It connects the quality of a researcher's publication and the quality of the journal in which the researcher publishes the article. The joint integral indicator is computed for the list of previous years' publications using the collaborative filtering algorithm. A proximity function of the authors' and journals' integral indicators is proposed as the quality functional. The degree of integration of researchers and publishers into international science is estimated. |
BibTeX: @article{Medvednikova2013CoIndicator, author = {M. M. Medvednikova and V. V. Strijov}, title = {Construction of rank-scaled quality integral indicator for scientific publications in using co-clustering}, journal = {Notices of Tula State University}, year = {2013}, volume = {1}, pages = {154-165}, url = {/papers/Medvednikova2012CoIndicator.pdf} } |
Rudoy G.I., Strijov V.V. Algorithms for inductive generation of superpositions for approximation of experimental data // Informatics and applications, 2013, 7(1) : 17-26. Article Rus |
Abstract: The paper presents an algorithm which inductively generates admissible non-linear models. An algorithm to generate all admissible superpositions of given complexity in finite number of iterations is proposed. The proof of its correctness is stated. The proposed approach is illustrated by a computational experiment on synthetic data. |
BibTeX: @article{Rudoy2013Generation, author = {Rudoy, Georgiy I. and Strijov, Vadim V.}, title = {Algorithms for inductive generation of superpositions for approximation of experimental data}, journal = {Informatics and applications}, year = {2013}, volume = {7(1)}, pages = {17-26}, url = {/papers/Rudoy2012Generation_Preprint.pdf} } |
Strijov V.V., Krymova E.A., Weber G.W. Evidence optimization for consequently generated models // Mathematical and Computer Modelling, 2013, 57(1-2) : 50-56. Article Rus |
Abstract: To construct an adequate regression model one has to fulfill the set of measured features with their generated derivatives. Often the number of these features exceeds the number of the samples in the data set. After a feature generation process the problem of feature selection from a set of highly correlated features arises. The proposed algorithm uses an evidence maximization procedure to select a model as a subset of generated features. During the selection process it rejects multicollinear features. A problem of European option volatility modeling illustrates the algorithm. Its performance is compared with the performances of similar well-known algorithms. |
BibTeX: @article{Strijov11Evidence, author = {Strijov, V. V. and Krymova, E. A. and Weber, G. W.}, title = {Evidence optimization for consequently generated models}, journal = {Mathematical and Computer Modelling}, year = {2013}, volume = {57(1-2)}, pages = {50-56}, url = {http://www.sciencedirect.com/science/article/pii/S0895717711001075}, doi = {10.1016/j.mcm.2011.02.017} } |
Strijov V.V. Error function in regression analysis // Factory Laboratory, 2013, 79(5) : 65-73. Article Rus |
Abstract:. |
BibTeX: @article{Strijov2013ErrorFunction, author = {Strijov, V. V.}, title = {Error function in regression analysis}, journal = {Factory Laboratory}, year = {2013}, volume = {79(5)}, pages = {65-73}, url = {/papers/Strijov2012ErrorFn.pdf} } |
Zaytsev A.A., Strijov V.V., Tokmakova A.A. Estimation regression model hyperparameters using maximum likelihood // Informational Technologies, 2013, 2 : 11-15. Article Rus |
Abstract: The paper considers the regression model selection problem. The model parameters are supposed to be a multivariate random variable with independently distributed components. A method for hyperparameter optimization is proposed. A direct way to obtain the hyperparameter estimates is shown. The paper illustrates the usage of the hyperparameters in the feature selection problem. The suggested method is compared with the Laplace approximation method. |
BibTeX: @article{Zaitsev2012Estimation, author = {A. A. Zaytsev and V. V. Strijov and A. A. Tokmakova}, title = {Estimation regression model hyperparameters using maximum likelihood}, journal = {Informational Technologies}, year = {2013}, volume = {2}, pages = {11-15}, url = {/papers/ZaytsevStrijovTokmakova2012Likelihood_Preprint.pdf} } |
Aduenko A.A., Kuzmin A.A., Strijov V.V. Hierarchical thematic model visualizing algorithm // 26th European Conference on Operational Research, 2013 : 155. InProceedings |
Abstract: The talk is devoted to the problem of thematic hierarchical model construction. One must construct a hierarchical model of scientific conference abstracts, check the adequacy of the expert models, and visualize hierarchical differences between the algorithmic and expert models. An algorithm for constructing a hierarchical thematic model is developed. It uses the notion of terminology similarity to construct the model. The obtained model is visualized as a plane graph. |
BibTeX: @inproceedings{KuzminStrijov2013VisualizingEURO, author = {Aduenko, A. A. and Kuzmin, A. A. and Strijov, V. V.}, title = {Hierarchical thematic model visualizing algorithm}, booktitle = {26th European Conference on Operational Research}, year = {2013}, pages = {155} } |
Kuznetsov M.P., Strijov V.V. The IUCN Red List threatened species categorization algorithm // 26th European Conference on Operational Research, 2013 : 352. InProceedings |
Abstract: The main purpose of the IUCN Red List is to categorize those plants and animals that are facing a high risk of extinction. Species are classified by the IUCN Red List into nine groups ordered by the relative risk of extinction in the wild. Each species is described with the rank-scaled features given by the experts. The problem is to associate each species with one of the groups according to the data given by the experts. We consider the rank-scaled features as cones in the space of objects and construct the solution as the nearest point to the superposition of these cones. |
BibTeX: @inproceedings{KuznetsovStrijov2013RedListEURO, author = {Kuznetsov, M. P. and Strijov, V. V.}, title = {The IUCN Red List threatened species categorization algorithm}, booktitle = {26th European Conference on Operational Research}, year = {2013}, pages = {352} } |
Strijov V.V. Credit Scorecard Development: Model Generation and Multimodel Selection // 26th European Conference on Operational Research, 2013 : 220. InProceedings |
Abstract: The talk is devoted to automatic model generation for application scoring. According to the bank requirements, a scorecard consists of a combination of logistic regression models. We will discuss the following problems. First, how many models must we generate? Second, which model from the generated model set should be used to compute the probability of default for a newcomer client? Third, what features must be selected for the models? These problems must be resolved to develop a precise, stable and simple scorecard. |
BibTeX: @inproceedings{Strijov2013ScorecardEURO, author = {Strijov, V. V.}, title = {Credit Scorecard Development: Model Generation and Multimodel Selection}, booktitle = {26th European Conference on Operational Research}, year = {2013}, pages = {220}, url = {/papers/Strijov2013EUROscoring.pdf} } |
2012Aduenko A.A., Kuzmin A.A., Strijov V.V. Feature selection and metrics optimisation for document collection clustering // Notices of Tula State University, 2012, 3 : 119-131. Article Rus |
Abstract: This paper deals with the problem of verifying the correctness of a thematic clustering of texts with the help of metric algorithms. An algorithm for selecting the optimal distance function for texts is proposed. The correspondence between the obtained clustering of texts and their expert classification is studied. The results of the clustering and their correspondence to the expert thematic classification are illustrated in a computational experiment on a real text collection. |
BibTeX: @article{AduenkoKuzminStrijov2013Selection, author = {A. A. Aduenko and A. A. Kuzmin and V. V. Strijov}, title = {Feature selection and metrics optimisation for document collection clustering}, journal = {Notices of Tula State University}, year = {2012}, volume = {3}, pages = {119-131}, url = {/papers/Kuzmin2013ThematicClustering.pdf} } |
Kuznetsov M.P., Strijov V.V., Medvednikova M.M. Multiclass classification of objects with the rank-scale description // Notices on Science and Technology of SPb. PSU, 2012, 5 : 92-95. Article Rus |
Abstract: The authors propose a method of integral indicator construction based on the rank-scaled description matrix given by an expert. The authors propose a three-step iterative algorithm to estimate correction parameters and feature weights. The feature selection problem is investigated. The method is illustrated with the problem of classifying the statuses of rare species in the Red Book of the Russian Federation. |
BibTeX: @article{Kuznetsov2012RankScales, author = {Kuznetsov, M. P. and Strijov, V. V. and Medvednikova, M. M.}, title = {Multiclass classification of objects with the rank-scale description}, journal = {Notices on Science and Technology of SPb. PSU}, year = {2012}, volume = {5}, pages = {92-95}, url = {/papers/Kuznetsov2012Curvilinear.pdf} } |
Medvednikova M.M., Strijov V.V., Kuznetsov M.P. Algorithm of multiclass monotonous Pareto-classification // Notices of Tula State University, 2012, 3 : 132-141. Article Rus |
Abstract: The authors propose a method to search for a monotonous function defined on the Cartesian product of linearly ordered sets. The method is based on the procedures of monotonization of a discrete-argument function and Pareto-optimal front slicing. The feature selection problem is investigated. The method is illustrated with the problem of forecasting the statuses of rare species in the Red Book of the Russian Federation. |
BibTeX: @article{Medvednikova2012RankScales, author = {Medvednikova, Mariya M. and Strijov, Vadim V. and Kuznetsov, Mikhail P.}, title = {Algorithm of multiclass monotonous Pareto-classification}, journal = {Notices of Tula State University}, year = {2012}, volume = {3}, pages = {132-141}, url = {/papers/Medvednikova2012RankScales.pdf} } |
Motrenko A.P., Strijov V.V. Multiclass logistic regression for cardio-vascular disease forecasting // Notices of Tula State University, 2012, 1 : 153-162. Article Rus |
Abstract: The paper describes an algorithm to classify four groups of patients: a cardio-vascular disease group, a cardio-risk group and two types of healthy groups. The blood-cells protein measurements are the description features for an investigated patient. The paper develops an algorithm to forecast a patient's cardio-vascular disease case as one of four unordered classes. The problem is to estimate the regression parameters and select the most informative features for multi-class classification. During the forecasting all pairs of the classes are considered. |
BibTeX: @article{Motrenko2012CVD, author = {A. P. Motrenko and V. V. Strijov}, title = {Multiclass logistic regression for cardio-vascular disease forecasting}, journal = {Notices of Tula State University}, year = {2012}, volume = {1}, pages = {153-162}, url = {/papers/MotrenkoStrijov2012HAPrediction.pdf} } |
Sanduleanu L.N., Strijov V.V. Feature selection for autoregressive forecasting // Informational Technologies, 2012, 6 : 11-15. Article Rus |
Abstract: The authors investigate the optimal model selection problem with application to autoregressive forecasting. To solve the problem one has to select a maximum well-defined feature subset, subject to some given value of the error function. To select the feature set, a modified add-del feature selection algorithm is used. This paper suggests a method of time series forecasting model selection. The computational experiment compares forecasts of hourly electricity prices. |
BibTeX: @article{Sanduleanu2012FeatureSelection_IT, author = {L. N. Sanduleanu and V. V. Strijov}, title = {Feature selection for autoregressive forecasting}, journal = {Informational Technologies}, year = {2012}, volume = {6}, pages = {11-15}, url = {/papers/SanduleanuStrijov2011FeatureSelection_Preprint.pdf} } |
Strijov V.V., Kuznetsov M.P., Rudakov K.V. Rank-scaled metric clustering of amino-acid sequences // Mathematical Biology and Bioinformatics, 2012, 7(1) : 345-359. Article Rus |
Abstract: To solve the problem of secondary protein structure recognition, an algorithm for clustering amino-acid subsequences is developed. To reveal clusters it uses the pairwise distances between the subsequences. The algorithm does not require the complete pairwise distance matrix. This main distinction reduces the computational complexity. To run the clustering, it needs no more than the ranks of the distances between subsequences. The algorithm is illustrated using synthetic data along with the amino-acid sequences from the UniProt database. |
BibTeX: @article{Strijov2012Clustering, author = {Strijov, V. V. and Kuznetsov, M. P. and Rudakov, K. V.}, title = {Rank-scaled metric clustering of amino-acid sequences}, journal = {Mathematical Biology and Bioinformatics}, year = {2012}, volume = {7}, number = {1}, pages = {345-359}, url = {/papers/Strijov2012(7_345).pdf} } |
Tokmakova A.A., Strijov V.V. Estimation of linear model hyperparameters for noisy or correlated feature selection problem // Informatics and applications, 2012, 6(4) : 66-75. Article Rus |
Abstract: This paper deals with the problem of feature selection in linear regression models. To select features, the authors estimate the covariance matrix of the model parameters. The dependent variable and the model parameters are assumed to be normally distributed vectors. The Laplace approximation is used to estimate the covariance matrix: the logarithm of the error function is approximated by the normal distribution function. The problem of noisy or correlated features is also examined, since in this case the covariance matrix of the model parameters becomes singular. An algorithm for feature selection is proposed. The results of the study for a time series are given in the computational experiment. |
BibTeX: @article{Tokmakova2012Hyper, author = {A. A. Tokmakova and V. V. Strijov}, title = {Estimation of linear model hyperparameters for noisy or correlated feature selection problem}, journal = {Informatics and applications}, year = {2012}, volume = {6}, number = {4}, pages = {66-75}, url = {/papers/Tokmakova2011HyperParJournal_Preprint.pdf} } |
Motrenko A.P., Strijov V.V., Weber G.-W. Bayesian sample size estimation for logistic regression // International Conference on Applied and Computational Mathematics, 2012 : 1-5. InProceedings |
Abstract: The paper is devoted to logistic regression analysis applied to classification problems in biomedicine. A group of patients is investigated as a sample set; each patient is described with a set of features, named biomarkers, and is classified into one of two classes. Since patient measurement is expensive, the problem is to reduce the number of measured features in order to increase the sample size. The response variable is assumed to follow a Bernoulli distribution. Also, parameters of the regression function are evaluated. With the given set of features, the model is excessively complex. The problem is to select a smaller set of features that will classify patients effectively. In logistic regression features are usually selected by stepwise regression. In the computational experiment, exhaustive search is implemented. This assures the experts that all possible combinations of the features were considered. The authors use the area under the ROC curve as the optimality criterion in the feature selection procedure. |
BibTeX: @inproceedings{Motrenko2012Bayesian, author = {Anastasiya P. Motrenko and Vadim V. Strijov and Gerhard-Wilhelm Weber}, title = {Bayesian sample size estimation for logistic regression}, booktitle = {International Conference on Applied and Computational Mathematics}, year = {2012}, pages = {1-5}, url = {/papers/MotrenkoStrijovWeber2012SampleSize_ICACM.pdf} } |
Strijov V.V. Sequential model selection in forecasting // 25th European Conference on Operational Research, 2012 : 176. InProceedings |
Abstract: To forecast financial time series one needs a set of models of optimal structure and complexity. The mixture model selection procedures are based on coherent Bayesian inference. To estimate the model parameters and covariance matrix, Laplace approximation methods are introduced. Using the covariance matrix one could split up the data set to form a mixture of models and select a model with minimum description length. |
BibTeX: @inproceedings{Strijov2012EURO, author = {Vadim V. Strijov}, title = {Sequential model selection in forecasting}, booktitle = {25th European Conference on Operational Research}, year = {2012}, pages = {176}, url = {/papers/Strijov2012EURO.pdf} } |
Kuznetsov M.P., Strijov V.V. Integral indicator construction using rank-scaled design matrix // Intellectual Information Processing. Conference proceedings, 2012 : 130-132. InProceedings Rus |
BibTeX: @inproceedings{Kuznetsov2012IOI, author = {Kuznetsov, M. P. and Strijov, V. V.}, title = {Integral indicator construction using rank-scaled design matrix}, booktitle = {Intellectual Information Processing. Conference proceedings}, year = {2012}, pages = {130--132}, url = {/papers/Kuznetsov2012IOI.pdf} } |
Rudoy G.I., Strijov V.V. Simplification of superpositions of primitive functions with graph rule-rewriting // Intellectual Information Processing. Conference proceedings, 2012 : 140-143. InProceedings Rus |
Abstract: The paper develops a superposition simplification algorithm for nonlinear regression. A superposition is represented as a directed acyclic graph. To simplify a superposition, a subtree of the graph is replaced with an isomorphic one. |
BibTeX: @inproceedings{Rudoy2012IOI, author = {Rudoy, G. I. and Strijov, V. V.}, title = {Simplification of superpositions of primitive functions with graph rule-rewriting}, booktitle = {Intellectual Information Processing. Conference proceedings}, year = {2012}, pages = {140--143}, url = {/papers/Rudoy2012IOI.pdf} } |
Tokmakova A.A., Strijov V.V. Estimation of linear model hyperparameters for noise or correlated feature selection problem // Intellectual Information Processing. Conference proceedings, 2012 : 156-159. InProceedings Rus |
Abstract: This paper deals with the problem of feature selection in linear regression models. To select features, the authors estimate the covariance matrix of the model parameters. The dependent variable and the model parameters are assumed to be normally distributed. The Laplace approximation is used to estimate the covariance matrix: the logarithm of the error function is approximated by the normal distribution function. In the case of noisy or correlated features, the covariance matrix becomes singular. An algorithm for feature selection is proposed. |
BibTeX: @inproceedings{Tokmakova2012IOI, author = {Tokmakova, A. A. and Strijov, V. V.}, title = {Estimation of linear model hyperparameters for noise or correlated feature selection problem}, booktitle = {Intellectual Information Processing. Conference proceedings}, year = {2012}, pages = {156-159}, url = {/papers/Tokmakova2012IOI.pdf} } |
2011Strijov V.V., Krymova E.A. Model selection in linear regression analysis // Informational Technologies, 2011, 10 : 21-26. Article |
Abstract: To obtain an adequate regression model one often has to enlarge the feature set by generating derivative features. So the regression problem must be reformulated as a feature selection problem. Here we assume that the number of features is almost equal to or exceeds the number of samples in the data set and present a comparative study of classical and new feature selection algorithms. The study is illustrated by the problem of European option volatility modelling. |
BibTeX: @article{krymova11vybor_it, author = {V. V. Strijov and E. A. Krymova}, title = {Model selection in linear regression analysis}, journal = {Informational Technologies}, year = {2011}, volume = {10}, pages = {21-26}, url = {http://novtex.ru/IT/it2011/number_10_annot.html#5} } |
Strijov V.V., Granic G., Juric J., Jelavic B., Maricic S.A. Integral indicator of ecological impact of the Croatian thermal power plants // Energy, 2011, 36(7) : 4144-4149. Article |
Abstract: The main goal of this paper is to present the methodology of construction of the Integral Indicator for the Croatian Thermal Power Plants and the Combined Heat and Power Plants. The Integral Indicator is intended to compare the Power Plants according to a certain criterion. The criterion of the ecological impact is chosen. The following features of the power plants are used: generated electricity and heat; consumed coal and liquid fuel; sulphur content in fuel; emitted CO2, SO2, NOx, and particles. The linear model is used to construct the Integral Indicator. The model parameters are defined by the Principal Component Analysis. The constructed Integral Indicator is compared with several others, such as Pareto-optimal slicing indicator and Metric indicator. The Integral Indicator keeps as much information about the waste measures of the power plants as possible; it is simple and robust. |
BibTeX: @article{strijov10integral_energy, author = {Vadim V. Strijov and Goran Granic and Jeljko Juric and Branka Jelavic and Sandra Antecevic Maricic}, title = {Integral indicator of ecological impact of the Croatian thermal power plants}, journal = {Energy}, year = {2011}, volume = {36}, number = {7}, pages = {4144-4149}, url = {http://www.sciencedirect.com/science/article/pii/S0360544211002799}, doi = {10.1016/j.energy.2011.04.030} } |
Krymova E.A., Strijov V.V. Feature selection algorithms for linear regression models from finite and countable sets // Factory laboratory, 2011, 77(5) : 63-68. Article Rus |
BibTeX: @article{krymova11algorithmy_zldm, author = {E. A. Krymova and V. V. Strijov}, title = {Feature selection algorithms for linear regression models from finite and countable sets}, journal = {Factory laboratory}, year = {2011}, volume = {77}, number = {5}, pages = {63-68}, url = {http://zldm.ru/content/article.php?ID=1155} } |
Strijov V.V. Specification of rank-scaled expert estimation using measured data // Factory laboratory, 2011, 77(7) : 72-78. Article Rus |
BibTeX: @article{strijov11utochnenie_zldm, author = {Strijov, V. V.}, title = {Specification of rank-scaled expert estimation using measured data}, journal = {Factory laboratory}, year = {2011}, volume = {77}, number = {7}, pages = {72-78}, url = {http://zldm.ru/content/article.php?ID=1186} } |
Kuznetsov M.P., Strijov V.V. Integral Indicators and Expert estimations of Ecological Impact // International Conference on Operations Research, 2011 : 32. InProceedings |
Abstract: To compare objects or alternative decisions one must evaluate the quality of each object. A real-valued scalar that corresponds to the object is called an integral indicator. The integral indicator of the object is a convolution of the object features. Expert estimations of one expert or an expert group could be indicators, too. We consider the problem of indicator construction as follows. There is a set of objects, which should be compared according to a certain quality criterion. A set of features describes each object. These two sets are given together with an "object/feature" matrix of measured data. We select the linear model of the convolution: the integral indicator is the linear combination of features and their weights. So, to construct the integral indicator we must find the weights of the given features. To do that we use the expert estimates of both indicators and weights in rank scales. To compute indicators, according to the linear model, one can use the expert set of weights. In the general case the computed indicators do not match the expert estimations of indicators. Our goal is to match the estimated and the computed integral indicators by maximizing a rank correlation between them. We consider the set of the estimated indicators and the set of the estimated weights as two cones in spaces of indicators and weights, respectively. Our goal is to find the set of weights such that the distance between this set and the cone of the expert-given weights is minimal. Using the found weights we compute the set of integral indicators such that the distance between this computed set and the cone of the expert-given integral indicators is minimal as well. This methodology is used for the Clean Development Mechanism project evaluation. The project partners have to prove that their project can yield emission reductions in developing countries, which could not be achieved in the project's absence. The proposed integral indicators are intended to evaluate the environmental impact of these projects. |
BibTeX: @inproceedings{Kuznetsov2011Integral, author = {Michail P. Kuznetsov and Vadim V. Strijov}, title = {Integral Indicators and Expert estimations of Ecological Impact}, booktitle = {International Conference on Operations Research}, year = {2011}, pages = {32}, url = {/papers/Kuznetsov2011OR.pdf} } |
Strijov V.V. Invariants and model selection in forecasting // International Conference on Operations Research, 2011 : 133. InProceedings |
Abstract: Time series in the financial sector may include annual, weekly and daily periodicities as well as non-periodic events. Examples are the time series of energy prices and consumed volumes and the time series of consumer sales volumes. Generalized linear autoregressive models are used to forecast these time series. The samples of the main time period of the time series correspond to the features of the forecasting models. To boost the quality of the forecast, two problems must be solved. First, we must select a set of features that forms the model of optimal quality. Second, we must split the time series into periodic and event segments and assign a model of optimal quality to each type of segment. To solve these problems, we estimate the distribution of the model parameters using coherent Bayesian inference. The optimal model for a given time segment has the most probable value of maximum evidence, which is estimated under the conditions of stepwise regression: features are added to and deleted from the active feature set so as to maximize the evidence. The splitting procedure includes analysis of the distributions of the model parameters. Consider two forecasting models that are defined on non-intersecting consecutive time segments. These models are different if the Kullback-Leibler distance between the distributions of their parameters is statistically significant. In this case the time-segment split is fixed; otherwise we consider the models equal and join the time segments. The proposed approach yields more precise time-segment splitting than the dynamic time warping procedure and increases the forecasting quality. As an illustration we discuss the automatic detection of seasonal sales and promotions of consumer goods. |
BibTeX: @inproceedings{Strijov2011Invariants_OR, author = {Vadim V. Strijov}, title = {Invariants and model selection in forecasting}, booktitle = {International Conference on Operations Research}, year = {2011}, pages = {133}, url = {/papers/Strijov2011OR.pdf} } |
Kuznetsov M.P., Strijov V.V. Monotonic interpolation for the rank-scaled expert estimations specification // Proceedings of Mathematical Methods of Pattern Recognition. 2011 : 162-165. InProceedings Rus |
BibTeX: @inproceedings{Kuznetsov-Strijov2011Oblique_mmro, author = {M. P. Kuznetsov and V. V. Strijov}, title = {Monotonic interpolation for the rank-scaled expert estimations specification}, booktitle = {Proceedings of Mathematical Methods of Pattern Recognition}, year = {2011}, pages = {162-165}, url = {/papers/Kuznetsov2011mmro15.pdf} } |
Pavlov K.V., Strijov V.V. Multilevel model selection in the bank credit scoring applications // Proceedings of Mathematical Methods of Pattern Recognition. 2011 : 158-161. InProceedings Rus |
BibTeX: @inproceedings{Pavlov2011Selection, author = {Pavlov, K. V. and Strijov, V. V.}, title = {Multilevel model selection in the bank credit scoring applications}, booktitle = {Proceedings of Mathematical Methods of Pattern Recognition}, year = {2011}, pages = {158-161}, url = {/papers/Pavlov2011mmro15.pdf} } |
Strijov V.V. Multilevel model selection using parameters covariance matrix analysis // Proceedings of Mathematical Methods of Pattern Recognition. 2011 : 154-157. InProceedings Rus |
BibTeX: @inproceedings{Strijov11Multimodel_mmro, author = {Strijov, V. V.}, title = {Multilevel model selection using parameters covariance matrix analysis}, booktitle = {Proceedings of Mathematical Methods of Pattern Recognition}, year = {2011}, pages = {154-157}, url = {/papers/Strijov2011mmro15.pdf} } |
2010Strijov V.V., Weber G.W. Nonlinear regression model generation using hyperparameter optimization // Computers and Mathematics with Applications, 2010, 60(4) : 981-988. Article |
Abstract: An algorithm of the inductive model generation and model selection is proposed to solve the problem of automatic construction of regression models. A regression model is an admissible superposition of smooth functions given by experts. Coherent Bayesian inference is used to estimate model parameters. It introduces hyperparameters which describe the distribution function of the model parameters. The hyperparameters control the model generation process. |
BibTeX: @article{Strijov2010981, author = {Strijov, V. V. and Weber, G. W.}, title = {Nonlinear regression model generation using hyperparameter optimization}, journal = {Computers and Mathematics with Applications}, year = {2010}, volume = {60}, number = {4}, pages = {981-988}, note = {PCO' 2010 - Gold Coast, Australia 2-4th December 2010, 3rd Global Conference on Power Control Optimization}, url = {http://www.sciencedirect.com/science/article/B6TYJ-4YX65PS-1/2/471789368d98fd837f293565dbfc0bbb}, doi = {10.1016/j.camwa.2010.03.021} } |
Strijov V.V. Methods of regression model selection. Moscow, Computing Center RAS, 2010 : 60. Book Rus |
Abstract: Problems of regression analysis could be posed as follows. First, a regression model and a data generation hypothesis are given. The data generation hypothesis is the distribution function of the random variable as well as assumptions about properties of the random variable. This problem is the optimization problem of the model parameters. Second, a class of regression models (linear models, radial basis functions, etc.) is given together with a data generation hypothesis. This problem is the problem of model selection. Third, a class of models and a class of data generation hypotheses are given (for example, the exponential family of distributions). To solve this problem one must use residual analysis. |
BibTeX: @book{strijov2010methody_ccas, author = {Vadim V. Strijov}, title = {Methods of regression model selection}, publisher = {Moscow, Computing Center RAS}, year = {2010}, pages = {60}, url = {http://www.machinelearning.ru/wiki/images/5/52/Strijov-Krymova10Model-Selection.pdf} } |
Strijov V.V. Evidence of successively generated models // International Conference on Operations Research "Mastering Complexity", 2010 : 223. InProceedings |
Abstract: Let us investigate an algorithm of regression model construction. The constructed model will be used to solve problems of the financial sector: it might be a scoring model, an energy consumption forecast model or a European option volatility smile model. We suppose that the given historical data are not sufficient to discover hidden dependencies in the investigated problem. So we propose the following approach to the model construction. Together with historical data we use an expert-given set of primitive functions. It is recommended to collect functions that are already widely used to model the investigated problem. Then we assign a generating function, which will be used to generate the set of competitive models. We estimate the evidence of the models using coherent Bayesian inference and select a model of the best structure. Since generating functions make a countable set of models, we organize an iterative generation-selection procedure. Each cycle of the procedure includes the following steps. First, we modify competitive models so that the structural distance between an original and a derivative model is as small as possible. Second, we estimate parameters and hyperparameters of the derivative model to cut off some model modifications at the following steps and reduce the algorithm complexity. Third, we analyze the evidence of the derivative model to find the probability that it becomes a model of the optimal structure. Also, we analyze some restrictions applied to the model structure and the robustness of the model. As a result we obtain a model that is interpretable from the expert's point of view; it fits historical data well and is robust. Some additional tests are applied to verify the resulting model: cross-validation and retrospective forecasting to ensure the quality of further use. |
BibTeX: @inproceedings{strijov10evidence_or, author = {Vadim V. Strijov}, title = {Evidence of successively generated models}, booktitle = {International Conference on Operations Research "Mastering Complexity"}, year = {2010}, pages = {223}, url = {/papers/strijov2010OR.pdf} } |
Strijov V.V. Model generation and model selection in credit scoring // 24th European Conference on Operations Research, 2010 : 220. InProceedings |
Abstract: The credit scorecard is a logistic regression model; it maps the feature space to the probability of default of a banking client. A classical scorecard is constructed by an analyst, who manually selects informative features and creates combinations of them. We propose a new technique for automatic scorecard construction. To develop a scorecard, one must assign a set of primitive functions and model generation rules. The resulting model is an admissible superposition of the primitive functions and features. Coherent Bayesian inference is used to select features and their superpositions. |
BibTeX: @inproceedings{strijov10model_euro, author = {Vadim V. Strijov}, title = {Model generation and model selection in credit scoring}, booktitle = {24th European Conference on Operations Research}, year = {2010}, pages = {220}, url = {/papers/strijov10ModelGen_EURO.pdf} } |
Strijov V.V., Krymova E.A., Weber G.W. Evidence Optimization for Consequently Generated Models // Proceedings of the fourth global conference on power control and optimization, 2010, 1337 : 204-208. InProceedings |
Abstract: We address the problem of segmenting nearly periodic time series into period-like segments. We introduce a definition of nearly periodic time series via triplets ⟨basic shape, shape transformation, time scaling⟩ that covers a wide range of time series. To split the time series into periods we select a pair of principal components of the Hankel matrix. We then cut the trajectory of the selected principal components by its symmetry axis, thus obtaining half-periods that are merged into segments. We describe a method of automatic selection of periodic pairs of principal components, corresponding to the fundamental periodicity. We demonstrate the application of the proposed method to the problem of period extraction for accelerometric time series of human gait. We see the automatic segmentation into periods as a problem of major importance for human activity recognition problem, since it allows to obtain interpretable segments: each extracted period can be seen as an ultimate entity of gait. The method we propose is more general compared to the application specific methods and can be used for any nearly periodical time series. We compare its performance to classical mathematical methods of period extraction and find that it is not only comparable to the alternatives, but in some cases performs better. Index Terms: sensor signal processing, nearly periodic time series, time series segmentation, period extraction, principal components analysis. |
BibTeX: @inproceedings{Strijov2011Evidence_AIP, author = {Strijov, V. V. and Krymova, E. A. and Gerhard, W. W.}, editor = {Nader Barsoum and Jeffrey Frank Webb and Pandian Vasant}, title = {Evidence Optimization for Consequently Generated Models}, booktitle = {Proceedings of the fourth global conference on power control and optimization}, year = {2010}, volume = {1337}, pages = {204-208}, url = {/papers/strijov-weber2010PCO-3.pdf}, doi = {10.1063/1.3592467} } |
Krymova E.A., Strijov V.V. Model selection and multicollinearity analysis // Proceedings of conference on Intelligent data processing, 2010 : 153-156. InProceedings Rus |
BibTeX: @inproceedings{krymova10vybor_ioi, author = {Krymova, E. A. and Strijov, V. V.}, title = {Model selection and multicollinearity analysis}, booktitle = {Proceedings of conference on Intelligent data processing}, year = {2010}, pages = {153-156}, url = {/papers/Krymova2010Select_IOI.pdf} } |
Skipor K.S., Strijov V.V. Least angle logistic regression // Proceedings of conference on Intelligent data processing, 2010 : 180-183. InProceedings Rus |
BibTeX: @inproceedings{skipor10method_ioi, author = {Skipor, K. S. and Strijov, V. V.}, title = {Least angle logistic regression}, booktitle = {Proceedings of conference on Intelligent data processing}, year = {2010}, pages = {180-183}, url = {/papers/Skipor2010-iip-8.pdf} } |
2009Strijov V.V., Sologub R.A. The inductive generation of the volatility smile models // Journal of Computational Technologies, 2009, 14(5) : 102-113. Article |
Abstract: Volatility of the European-type options depends on their strike and maturity. The authors consider volatility smile models based not only on expert knowledge but also on data. A model generation algorithm is proposed. It generates volatility models of the optimal structure inductively using implied volatility data and expert considerations. The models satisfy expert assessments. The Brent Crude Oil option was considered as an example. |
BibTeX: @article{strijov09jct, author = {Strijov, V. V. and Sologub, R. A.}, title = {The inductive generation of the volatility smile models}, journal = {Journal of Computational Technologies}, year = {2009}, volume = {14}, number = {5}, pages = {102-113}, url = {/papers/Strijov09JCT5.pdf} } |
Strijov V.V., Krymova E.A. Algorithms of linear model generation // Mathematics. Computer. Education. Conference Proceedings, 2009. InProceedings |
BibTeX: @inproceedings{krymova09mce, author = {Strijov, V. V. and Krymova, E. A.}, title = {Algorithms of linear model generation}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2009}, url = {/papers/krymova09mce.pdf} } |
Krymova E.A., Strijov V.V. Comparison of the heuristic algorithms for linear regression model selection // Mathematical methods for pattern recognition. Conference proceedings. MAKS Press, 2009 : 145-148. InProceedings |
BibTeX: @inproceedings{krymova09mmro, author = {Krymova, E. A. and Strijov, V. V.}, title = {Comparison of the heuristic algorithms for linear regression model selection}, booktitle = {Mathematical methods for pattern recognition. Conference proceedings}, publisher = {MAKS Press}, year = {2009}, pages = {145-148}, url = {/papers/strijov09MM1_MMRO-14.pdf} } |
Melnikov D.I., Strijov V.V., Anderrva E.Y., Edenharter G. Selection of support object set for robust integral indicator construction // Mathematical methods for pattern recognition. Conference proceedings. MAKS Press, 2009 : 159-162. InProceedings |
BibTeX: @inproceedings{melnikov09mmro, author = {Melnikov, D. I. and Strijov, V. V. and Anderrva, E. Yu. and Edenharter, G.}, title = {Selection of support object set for robust integral indicator construction}, booktitle = {Mathematical methods for pattern recognition. Conference proceedings}, publisher = {MAKS Press}, year = {2009}, pages = {159-162}, url = {/papers/strijov09MM2_MMRO-14.pdf} } |
Strijov V.V., Sologub R.A. Generation of the implied volatility models // Mathematics. Computer. Education. Conference Proceedings, 2009. InProceedings |
BibTeX: @inproceedings{sologub09mce, author = {Strijov, V. V. and Sologub, R. A.}, title = {Generation of the implied volatility models}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2009}, url = {/papers/sologub09mce.pdf} } |
Strijov V.V. Model selection using inductively generated set // European Conference on Operational Research EURO-23, 2009 : 114. InProceedings |
Abstract: Model selection is one of the most important subjects of machine learning. An algorithm of model selection depends on the class of models and on the investigated problems. The lecture addresses problems of regression analysis. Linear as well as nonlinear regression models will be considered. The models are supposed to be inductively generated during the selection process. Properties of LARS, Optimal brain surgery and Bayesian coherent inference algorithms will be analyzed in the light of model selection. |
BibTeX: @inproceedings{strijov09EURO, author = {Strijov, V. V.}, title = {Model selection using inductively generated set}, booktitle = {European Conference on Operational Research EURO-23}, year = {2009}, pages = {114}, url = {/papers/strijov2009EURO23.pdf} } |
Strijov V.V., Granic G., Juric Z., Jelavic B., Maricic S. Integral Indicator of Ecological Footprint for Croatian Power Plants // HED Energy Forum «Quo Vadis Energija in Times of Climate Change», 2009 : 46. InProceedings |
Abstract: The main goal of this paper is to present the methodology of construction of the Integral Indicator for Croatian Power Plants. The Integral Indicator is necessary to compare Power Plants selected according to a certain criterion. Here the criterion of the Ecological Footprint was chosen. TPP and CHP Power Plants were selected. The following features were used: generated electricity and heat; consumed coal and liquid fuel; sulphur content in fuel; emitted CO2, SO2, NOx and particles. To construct the Integral Indicator, a linear model was used. The model was tuned by the Principal Component Analysis algorithm. The constructed Integral Indicator was compared with several others, such as the Pareto-Optimal Slicing Indicator and the Metric Indicator. The Integral Indicator keeps as much information about features of the Power Plants as possible; it is simple and robust. |
BibTeX: @inproceedings{strijov09HED, author = {Strijov, V. V. and Granic, G. and Juric, Z. and Jelavic, B. and Maricic, S. A.}, title = {Integral Indicator of Ecological Footprint for Croatian Power Plants}, booktitle = {HED Energy Forum «Quo Vadis Energija in Times of Climate Change»}, year = {2009}, pages = {46}, url = {/papers/IndicatorOfEcoFootprintForCroatianPPs09HED_EIHP.pdf} } |
Strijov V.V. Model generation and model selection // Mathematics. Computer. Education. Conference Proceedings, 2009. InProceedings |
BibTeX: @inproceedings{strijov09mce, author = {Strijov, V. V.}, title = {Model generation and model selection}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2009}, url = {/papers/strijov09mce.pdf} } |
Strijov V.V., Sologub R.A. Algorithm of nonlinear regression model selection by analysis of hyperparameters // Mathematical methods for pattern recognition. Conference proceedings. MAKS Press, 2009 : 184-187. InProceedings |
BibTeX: @inproceedings{strijov09mmro, author = {Strijov, V. V. and Sologub, R. A.}, title = {Algorithm of nonlinear regression model selection by analysis of hyperparameters}, booktitle = {Mathematical methods for pattern recognition. Conference proceedings}, publisher = {MAKS Press}, year = {2009}, pages = {184-187}, url = {/papers/strijov09MM3_MMRO-14.pdf} } |
Strijov V.V. The Inductive Algorithms of Model Generation // SIAM Conference on Computational Science and Engineering, 2009. InProceedings |
Abstract: One of the important problems in scientific data mining is the problem of regression modeling. To build a regression model from measured data, a researcher examines a set of competing models and chooses the model of the best quality. Due to the nature of the experiments, non-linear models are common in biological simulations. Symbolic regression allows dealing with large sets of non-linear models. In the lecture, inductive algorithms for model creation and selection will be discussed. |
BibTeX: @inproceedings{strijov09SIAMcse09, author = {Strijov, V. V.}, title = {The Inductive Algorithms of Model Generation}, booktitle = {SIAM Conference on Computational Science and Engineering}, year = {2009}, url = {/papers/strijov09_SIAM_cse09.pdf} } |
Strijov A.V., Strijov V.V. Specification of the rank-scaled expert estimations // Mathematics. Computer. Education. Conference Proceedings, 2009 : 41. InProceedings |
Abstract: The algorithm of the integral indicators construction is described. It uses rank-scaled expert estimations and an object-feature data matrix. The expert estimations are specified according to the data and additional expert preferences. To construct integral indicators, linear regression methods are involved. The suggested algorithm is compared with the algorithm of linear-scaled expert estimations concordance. |
BibTeX: @inproceedings{strizhov09mce, author = {Strijov, A. V. and Strijov, V. V.}, title = {Specification of the rank-scaled expert estimations}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2009}, pages = {41}, url = {/papers/strizhov09mce.pdf} } |
2008Strijov V.V. The methods for the inductive generation of regression models. Moscow, Computing Center RAS, 2008. Book Rus |
BibTeX: @book{strijov08ln, author = {Strijov, V. V.}, title = {The methods for the inductive generation of regression models}, publisher = {Moscow, Computing Center RAS}, year = {2008}, url = {/papers/strijov08ln.pdf} } |
Bray D., Strijov V.V. Using immune markers for classification of the CVD patients // Intellectual Data Analysis: Abstracts of the International Scientific Conference, 2008 : 49-50. InProceedings |
Abstract: The goal of the investigation is to find an algorithm that successfully separates different groups of patients with Cardio-Vascular Disease. The algorithm must select the most informative features: the markers, which bring the minimal number of the misclassified patients. Four groups of the CVD-patients are considered: A1 (surgery performed), A3 (risk group) and B1, B2 (healthy groups). Each group contained up to 15 patients. Each patient is described with 20 immune markers. Since the number of the patients in the sample is relatively small, the number of the informative markers must not exceed a few to avoid overtraining. The algorithm must process pairs of the classes. |
BibTeX: @inproceedings{bray08ioi, author = {Bray, D. and Strijov, V. V.}, title = {Using immune markers for classification of the CVD patients}, booktitle = {Intellectual Data Analysis: Abstracts of the International Scientific Conference}, year = {2008}, pages = {49-50}, url = {/papers/bray08ioi.pdf} } |
Strijov V.V., Sologub R.A. The inductive generation of the volatility smile models // SIAM Conference on Financial Mathematics and Engineering 2008, 2008 : 21. InProceedings |
Abstract: Volatility of the European-type options depends on their strike and maturity. The authors consider volatility smile models based not only on expert knowledge but also on measured data. A model generation algorithm is proposed. It generates volatility models of the optimal structure inductively using implied volatility data and expert considerations. The models satisfy expert assessments. The Brent Crude Oil option was considered as an example. |
BibTeX: @inproceedings{sologub08finance, author = {Strijov, V. V. and Sologub, R. A.}, title = {The inductive generation of the volatility smile models}, booktitle = {SIAM Conference on Financial Mathematics and Engineering 2008}, year = {2008}, pages = {21}, url = {/papers/sologub08finance_eng.pdf} } |
Strijov V.V. On the inductive model generation // Intellectual Data Analysis: Abstracts of the International Scientific Conference, 2008 : 220. InProceedings |
Abstract: This talk is devoted to the problem of the automatic model creation in regression analysis. The models are intended for dynamic systems behavior analysis. The theory and the practice of the inductively-generated models will be examined. |
BibTeX: @inproceedings{strijov08ioi, author = {Strijov, V. V.}, title = {On the inductive model generation}, booktitle = {Intellectual Data Analysis: Abstracts of the International Scientific Conference}, year = {2008}, pages = {220}, url = {/papers/strijov08ioi.pdf} } |
Strijov V.V. Clusterization of multidimensional time-series using dynamic time warping // Mathematics. Computer. Education. Conference Proceedings, 2008 : 28. InProceedings |
BibTeX: @inproceedings{strijov08macoed, author = {Strijov, V. V.}, title = {Clusterization of multidimensional time-series using dynamic time warping}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2008}, pages = {28} } |
Strijov V.V. Estimation of hyperparameters on parametric regression model generation // 9th International Conference on Pattern Recognition and Image Analysis: New Information Technologies, 2008, 2 : 178-181. InProceedings |
Abstract: The problem of non-linear regression analysis is considered. The algorithm of inductive model generation is described. The regression model is a superposition of given smooth functions. To estimate the model parameters, a two-level Bayesian inference technique was used. It introduces hyperparameters, which describe the distribution function of the model parameters. |
BibTeX: @inproceedings{strijov08roai, author = {Strijov, V. V.}, title = {Estimation of hyperparameters on parametric regression model generation}, booktitle = {9th International Conference on Pattern Recognition and Image Analysis: New Information Technologies}, year = {2008}, volume = {2}, pages = {178-181}, url = {/papers/strijov08roai_source.pdf} } |
Vorontsov K.V., Inyakin A.S., Strijov V.V., Chekhovich Y.V. MachineLearning.ru: a site, devoted to problems of pattern recognition, forecasting and classification // Intellectual Data Analysis: the International Scientific Conference, 2008 : 56-58. InProceedings |
BibTeX: @inproceedings{vorontsov08ml, author = {Vorontsov, K. V. and Inyakin, A. S. and Strijov, V. V. and Chekhovich, Yu. V.}, title = {MachineLearning.ru: a site, devoted to problems of pattern recognition, forecasting and classification}, booktitle = {Intellectual Data Analysis: the International Scientific Conference}, year = {2008}, pages = {56-58} } |
Vorontsov K.V., Inyakin A.S., Lisitsa A., Strijov V.V., Khachay M.Y., Chekhovich Y.V. Proof-ground for classification algorithms: the distributed computing system // Intellectual Data Analysis: the International Scientific Conference, 2008 : 54-56. InProceedings |
BibTeX: @inproceedings{vorontsov08polygon, author = {Vorontsov, K. V. and Inyakin, A. S. and Lisitsa, A. and Strijov, V. V. and Khachay, M. Yu. and Chekhovich, Yu. V.}, title = {Proof-ground for classification algorithms: the distributed computing system}, booktitle = {Intellectual Data Analysis: the International Scientific Conference}, year = {2008}, pages = {54-56} } |
Gushchin A.V., Strijov V.V. An algorithm on the expert estimations objectification with measured data // Intellectual Data Analysis: the International Scientific Conference, 2008 : 78-79. InProceedings Rus |
BibTeX: @inproceedings{gushchin08ioi, author = {Gushchin, A. V. and Strijov, V. V.}, title = {An algorithm on the expert estimations objectification with measured data}, booktitle = {Intellectual Data Analysis: the International Scientific Conference}, year = {2008}, pages = {78-79}, url = {/papers/gushchin08ioi.pdf} } |
Sologub R.A., Strijov V.V. The inductive construction of the volatility regression models // Intellectual Data Analysis: the International Scientific Conference Proceedings, 2008 : 215-216. InProceedings Rus |
BibTeX: @inproceedings{sologub08ioi, author = {Sologub, R. A. and Strijov, V. V.}, title = {The inductive construction of the volatility regression models}, booktitle = {Intellectual Data Analysis: the International Scientific Conference Proceedings}, year = {2008}, pages = {215-216}, url = {/papers/sologub08ioi.pdf} } |
2007Strijov V.V. The search for a parametric regression model in an inductive-generated set // Journal of Computational Technologies, 2007, 1 : 93-102. Article |
Abstract: The procedure of the search for a regression model is described. The model set is a set of superpositions of smooth functions. The model parameters estimations are used in the search. A model of pressure in a spray chamber of a combustion engine illustrates the approach. In this paper one of the important parts of the proposed project is described. |
BibTeX: @article{strijov07jct, author = {Strijov, V. V.}, title = {The search for a parametric regression model in an inductive-generated set}, journal = {Journal of Computational Technologies}, year = {2007}, volume = {1}, pages = {93-102}, url = {/papers/strijov06poisk_jct_en.pdf} } |
Strijov V.V., Kazakova T.V. Stable indices and the choice of a support description set // Zavodskaya Laboratoriya, 2007, 7 : 72-76. Article Rus |
Abstract: This paper describes an integral indicator construction algorithm. The integral indicator is a linear combination of object features. The features are linear-scaled. Outliers among the objects are assumed to be present. The problem of stable integral indicator construction is posed and solved. To construct the stable integral indicator, a specially defined subset of objects is selected. An unsupervised algorithm is used to construct the integral indicator. The proposed algorithm was used to construct an integral indicator of the foodstuff pollution level in Russian regions. |
BibTeX: @article{strijov07stable, author = {Strijov, V. V. and Kazakova, T. V.}, title = {Stable indices and the choice of a support description set}, journal = {Zavodskaya Laboratoriya}, year = {2007}, volume = {7}, pages = {72-76}, url = {/papers/stable_idx4zavlab_after_recenz.pdf} } |
Strijov V.V., Ptashko G.O. Algorithms of the optimal regression model selection. Computing Center of the Russian Academy of Sciences, 2007 : 56. Book Rus |
Abstract: A model is defined by a superposition of the smooth functions. The probability density functions of the model parameters are used. The parameters are estimated with non-linear optimization methods. A problem of the diesel engine pressure modelling presents an application of the method. The parametric and non-parametric approaches to model generation are examined. The prototype of the proposed software is described. |
BibTeX: @book{strijov06occam, author = {Strijov, V. V. and Ptashko, G. O.}, title = {Algorithms of the optimal regression model selection}, publisher = {Computing Center of the Russian Academy of Sciences}, year = {2007}, pages = {56}, url = {/papers/occam.pdf} } |
Strijov V.V., Ptashko G.O. The invariants of time series and dynamic time warping // Proc. Mathematical Methods of Pattern Recognition, 2007 : 212-214. InProceedings Rus |
Abstract: Two methods of the regression models usage were compared: the direct regression model and the approximation of the Minimum Cost Path in the Dynamic Time Warping. |
BibTeX: @inproceedings{strijov07invariants, author = {Strijov, V. V. and Ptashko, G. O.}, title = {The invariants of time series and dynamic time warping}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2007}, pages = {212-214}, url = {/papers/strijov_MM_1.pdf} } |
Strijov V.V., Kazakova T.V. The rank-scaled expert estimations concordance // Proc. Mathematical Methods of Pattern Recognition, 2007 : 209-211. InProceedings Rus |
Abstract: Regression models with restrictions defined by experts are described. A new method of multivariate regression modelling is proposed. |
BibTeX: @inproceedings{strijov07object, author = {Strijov, V. V. and Kazakova, T. V.}, title = {The rank-scaled expert estimations concordance}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2007}, pages = {209-211}, url = {/papers/strijov_MM_2.pdf} } |
Ivakhnenko A.A., Kanevskiy D.Y., Rudeva A.V., Strijov V.V. How to compare marked time-series // Proc. Mathematical Methods of Pattern Recognition, 2007 : 134-137. InProceedings Rus |
Abstract: The multi-model regression markup method was described. The markups were used for classification of financial time series. |
BibTeX: @inproceedings{strijov07timeseries, author = {Ivakhnenko, A. A. and Kanevskiy, D. Yu. and Rudeva, A. V. and Strijov, V. V.}, title = {How to compare marked time-series}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2007}, pages = {134-137}, url = {/papers/strijov_MM_AS_4.pdf} } |
2006Strijov V.V. The search for regression models in an inductive-generated set // Artificial intelligence, 2006, 2 : 234-237. Article Rus |
Abstract: The usage of Bayesian inference for the inductive-generated models was described. The algorithm of the arbitrary superpositions of the regression models was introduced. The algorithm uses hyperparameters to estimate the importance of model elements. |
BibTeX: @article{strijov06AI, author = {Strijov, V. V.}, title = {The search for regression models in an inductive-generated set}, journal = {Artificial intelligence}, year = {2006}, volume = {2}, pages = {234-237}, url = {/papers/strijov06AI.pdf} } |
Kazakova T.V., Strijov V.V. The robust indicators with normalising functions selection // Artificial intelligence, 2006, 1 : 160-163. Article Rus |
Abstract: The problem of the stable integral indicators is considered. The objects are linear-scaled. To construct a stable integral indicator one has to choose a subset such that the objects in the set bring the maximal value to the criterion of stability. A method of the feature selection according to the regression model robustness was introduced. |
BibTeX: @article{strijov06AIidx, author = {Kazakova, T. V. and Strijov, V. V.}, title = {The robust indicators with normalising functions selection}, journal = {Artificial intelligence}, year = {2006}, volume = {1}, pages = {160-163}, url = {/papers/strijov06AIidx.pdf} } |
Strijov V.V. Specification of expert estimations using measured data // Factory Laboratory, 2006, 72(7) : 59-64. Article Rus |
Abstract: To construct stable integral indicators we use expert estimations of object features. The indicators are linear combinations of the features. Their values are corrected with the expert estimations. A new method of multivariate regression is described. The model parameters are specified by expert estimations. |
BibTeX: @article{strijov06utochnenie_zldm, author = {Strijov, V. V.}, title = {Specification of expert estimations using measured data}, journal = {Factory Laboratory}, year = {2006}, volume = {72}, number = {7}, pages = {59-64}, url = {/papers/strijov06precise.pdf} } |
Strijov V.V. Vsevolod Vladimirovich Shakin // Mathematics. Computer. Education. Conference Proceedings. Regular and chaotic dynamics, 2006, 1 : 5-16. InCollection Rus |
BibTeX: @incollection{strijov06shakin, author = {Strijov, V. V.}, editor = {Riznichenko, G. Yu.}, title = {Vsevolod Vladimirovich Shakin}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, publisher = {Regular and chaotic dynamics}, year = {2006}, volume = {1}, pages = {5-16}, url = {/papers/VsevolodShakin06paper.pdf} } |
Strijov V.V. Indices construction using linear and ordinal expert estimations // Citizens and Governance for Sustainable Development, 2006 : 49. InProceedings |
Abstract: Indices are necessary to compare objects united in a set according to a certain criterion. For example, the objects are national protected areas or power plants. An index is a number that corresponds to an object. In this research an algorithm for construction of quality indices using expert estimations is developed. Consider an indices construction problem. A set of comparable objects and a set of features are given together with an "object-feature" matrix of measured data. Expert estimations of the indices and of the feature importance weights are given. A model of indices computation is chosen. In the general case the computed indices don't coincide with the expert estimations of the indices. The computed importance weights don't coincide with the expert estimations of the importance weights, either. One has to compute indices that are based on measured data under the condition that the indices must not contradict the given expert estimations. Two approaches to the problem have been suggested. The first one is the unsupervised indices construction. It finds the model parameters that provide the maximal value of a self-descriptiveness criterion. The second approach is the supervised indices construction. The model parameters are set so that they provide the minimal value of the distance between the computed indices and their expert estimations. Now a third approach is proposed. According to this approach the experts can resolve the contradiction between expert estimations of indices, importance weights and measured data. To this end, a hyperparameter is embedded in the model. Its value corresponds to the importance of either the indices or the feature weights. |
BibTeX: @inproceedings{strijo06sigsud, author = {Strijov, V. V.}, title = {Indices construction using linear and ordinal expert estimations}, booktitle = {Citizens and Governance for Sustainable Development}, year = {2006}, pages = {49}, url = {/papers/strijo06Abstract_SIGSUD_RuEng.pdf} } |
Kazakova T.V., Strijov V.V. The robust indicators with normalising functions selection // International Scientific Conference on Artificial Intelligence, 2006 : 199. InProceedings Rus |
BibTeX: @inproceedings{kazakova06ioi, author = {Kazakova, T. V. and Strijov, V. V.}, title = {The robust indicators with normalising functions selection}, booktitle = {International Scientific Conference on Artificial Intelligence}, year = {2006}, pages = {199}, url = {/papers/strijov_kazakova2006ioi.pdf} } |
Strijov V.V. The search for regression models in a set of smooth functions // Mathematics. Computer. Education. Conference Proceedings, 2006. InProceedings Rus |
BibTeX: @inproceedings{strijov06mce, author = {Strijov, V. V.}, title = {The search for regression models in a set of smooth functions}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2006}, url = {/papers/strijov06mce.pdf} } |
Strijov V.V. The search for regression models in an inductive-generated set // International Scientific Conference on Artificial Intelligence, 2006 : 198. InProceedings Rus |
BibTeX: @inproceedings{strijov2006ioi, author = {Strijov, V. V.}, title = {The search for regression models in an inductive-generated set}, booktitle = {International Scientific Conference on Artificial Intelligence}, year = {2006}, pages = {198}, url = {/papers/strijov2006ioi.pdf} } |
Strijov V.V., Kazakova T.V. Robust indicators and selection of support objects // Multivariate statistical analysis applications in economics and quality assessment. VIII-th International Conference, 2006. InProceedings Rus |
BibTeX: @inproceedings{strijovkazakova06CEMI, author = {Strijov, V. V. and Kazakova, T. V.}, title = {Robust indicators and selection of support objects}, booktitle = {Multivariate statistical analysis applications in economics and quality assessment. VIII-th International Conference}, year = {2006}, url = {/papers/strijovkazakova06CEMI.pdf} } |
2005Kazakova T.V., Strijov V.V. Stable integral indices // Proc. Mathematical Methods of Pattern Recognition, 2005 : 206. InProceedings Rus |
BibTeX: @inproceedings{kazakova05mmro, author = {Kazakova, T. V. and Strijov, V. V.}, title = {Stable integral indices}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2005}, pages = {206}, url = {/papers/kazakova05mmro.pdf} } |
Ptashko G.O., Strijov V.V., Shakin V.V. Specification of ordinal expert estimations // Mathematics. Computer. Education. Conference Proceedings, 2005. InProceedings Rus |
BibTeX: @inproceedings{ptashko05macoed, author = {Ptashko, G. O. and Strijov, V. V. and Shakin, V. V.}, title = {Specification of ordinal expert estimations}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2005}, url = {/papers/macoed05_2.pdf} } |
Ptashko G.O., Strijov V.V. The distance function choice for the phase trajectories comparison // Proc. Mathematical Methods of Pattern Recognition, 2005 : 116-119. InProceedings Rus |
Abstract: The method of the regression model comparison is examined. |
BibTeX: @inproceedings{ptashko05mmro, author = {Ptashko, G. O. and Strijov, V. V.}, title = {The distance function choice for the phase trajectories comparison}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2005}, pages = {116-119}, url = {/papers/ptashko05mmro.pdf} } |
Strijov V.V., Shakin V.V. Selection of optimal regression model // Mathematics. Computer. Education. Conference Proceedings, 2005. InProceedings Rus |
BibTeX: @inproceedings{strijov05macoed, author = {Strijov, V. V. and Shakin, V. V.}, title = {Selection of optimal regression model}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2005}, url = {/papers/macoed05_2.pdf} } |
Strijov V.V. How to select a nonlinear regression model of optimal complexity? // Proc. Mathematical Methods of Pattern Recognition, 2005 : 190-191. InProceedings Rus |
Abstract: A model of optimal complexity was chosen from a set of several thousand inductively-generated models. The Bayesian inference was used. |
BibTeX: @inproceedings{strijov05mmro, author = {Strijov, V. V.}, title = {How to select a nonlinear regression model of optimal complexity?}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2005}, pages = {190-191}, url = {/papers/strijov05mmro.pdf} } |
2003Strijov V.V., Shakin V.V. Index construction: the expert-statistical method // Environmental research, engineering and management, 2003, 26(4) : 51-55. Article |
Abstract: This paper deals with index construction and presents a new technique that involves expert estimations of object indices as well as feature significance weights. An index is calculated as a linear combination of the object's features. Unsupervised methods of index construction are reviewed and compared with the new method. Experts can estimate the index and verify the results. The results are precise, valid indices and reasoned expert estimations. This technique was used in various economical, sociological, and ecological applications. This paper introduces a method of multivariate regression model construction. Here an integral indicator is a regression model with applied restrictions. |
BibTeX: @article{strijov03index, author = {Strijov, V. V. and Shakin, V. V.}, title = {Index construction: the expert-statistical method}, journal = {Environmental research, engineering and management}, year = {2003}, volume = {26}, number = {4}, pages = {51-55}, note = {ISSN 1392-1649}, url = {/papers/10-v_strijov.pdf} } |
Strijov V.V., Shakin V.V. Index construction: the expert-statistical method // Proc. Conference on Sustainability Indicators and Intelligent Decisions, 2003 : 56-57. InProceedings |
Abstract: There are lots of ways to construct indices. However, when algorithms are chosen and some results obtained, the following question arises: How to show adequacy of the calculated indices? To answer the question analysts invite experts. The experts express their opinion and then the second question arises: How to show that expert estimations are valid? |
BibTeX: @inproceedings{strijov03siid, author = {Strijov, V. V. and Shakin, V. V.}, title = {Index construction: the expert-statistical method}, booktitle = {Proc. Conference on Sustainability Indicators and Intelligent Decisions}, year = {2003}, pages = {56-57}, url = {/papers/siid03.pdf} } |
Strijov V.V., Shakin V.V. Forecast and control with autoregressive models // Proc. Mathematical Methods of Pattern Recognition conference, 2003 : 178-181. InProceedings Rus |
Abstract: An autoregressive model is represented as the model of dynamic system behavior. One can control the system state using the inverse regression model. The authors use time series to verify the models. |
BibTeX: @inproceedings{strijov03prognoz, author = {Strijov, V. V. and Shakin, V. V.}, title = {Forecast and control with autoregressive models}, booktitle = {Proc. Mathematical Methods of Pattern Recognition conference}, year = {2003}, pages = {178-181}, url = {/papers/mmro11.pdf} } |
Aivazian S.A., Strijov V.V., Shakin V.V. On a problem of macroeconomics management. Computing Center of the Russian Academy of Sciences, 2003. TechReport Rus |
Abstract: This paper considers an application of autoregressive models. The models are used to control a macroeconomic system so that it reaches a given state. The quality of the control is defined as an integral indicator. |
BibTeX: @techreport{aivazian03macro, author = {Aivazian, S. A. and Strijov, V. V. and Shakin, V. V.}, title = {On a problem of macroeconomics management}, institution = {Computing Center of the Russian Academy of Sciences}, year = {2003}, url = {/papers/macro1.pdf} } |
2002
Strijov V.V. Expert estimations concordance for biosystems under extreme conditions. Notes on applied mathematics. Moscow, Computing Center of RAS, 2002. Book Rus |
BibTeX: @book{Strijov2002Extreme, author = {Strijov, V. V.}, title = {Expert estimations concordance for biosystems under extreme conditions. Notes on applied mathematics}, publisher = {Moscow, Computing Center of RAS}, year = {2002}, url = {/papers/strijov280502.pdf} } |
Molak V., Strijov V.V., Shakin V.V. Kyoto-Index for power plants in the USA // Mathematics. Computer. Education. Conference Proceedings, 2002 : 292. InProceedings Rus |
BibTeX: @inproceedings{molak02usa, author = {Molak, V. and Strijov, V. V. and Shakin, V. V.}, title = {Kyoto-Index for power plants in the USA}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2002}, pages = {292}, url = {/papers/kimacoed02.pdf} } |
Strijov V.V., Shakin V.V. Rank-scaled expert estimations concordance // International Scientific Conference on Artificial Intelligence, 2002 : 82-83. InProceedings Rus |
BibTeX: @inproceedings{strijov02ioi, author = {Strijov, V. V. and Shakin, V. V.}, title = {Rank-scaled expert estimations concordance}, booktitle = {International Scientific Conference on Artificial Intelligence}, year = {2002}, pages = {82-83}, url = {/papers/ioi2002.pdf} } |
Strijov V.V., Shakin V.V. Rank-scaled expert estimations processing // Mathematics. Computer. Education. Conference Proceedings, 2002 : 148. InProceedings Rus |
BibTeX: @inproceedings{strijov02macoed, author = {Strijov, V. V. and Shakin, V. V.}, title = {Rank-scaled expert estimations processing}, booktitle = {Mathematics. Computer. Education. Conference Proceedings}, year = {2002}, pages = {148}, url = {/papers/MaCoEd2002.pdf} } |
Strijov V.V. Specification of expert estimations for integral indicators construction (thesis abstract). Computing Center of the Russian Academy of Sciences, 2002 : 24. PhdThesis Rus |
BibTeX: @phdthesis{strijov02phdreferat, author = {Strijov, V. V.}, title = {Specification of expert estimations for integral indicators construction (thesis abstract)}, school = {Computing Center of the Russian Academy of Sciences}, year = {2002}, pages = {24}, note = {Author's abstract}, url = {/papers/concorda.pdf} } |
Strijov V.V. Specification of expert estimations for integral indicators construction (thesis manuscript). Computing Center of the Russian Academy of Sciences, 2002 : 105. PhdThesis Rus |
Abstract: The supervised and non-supervised index construction methods are investigated. The expert estimation concordance problem is represented as a problem of regression analysis. |
BibTeX: @phdthesis{strijov02phdthesis, author = {Strijov, V. V.}, title = {Specification of expert estimations for integral indicators construction (thesis manuscript)}, school = {Computing Center of the Russian Academy of Sciences}, year = {2002}, pages = {105}, url = {/papers/concordt.pdf} } |
Strijov V.V., et al. Methodology elements of the university research effectiveness estimations. Part 1. Computing Center of the Russian Academy of Sciences, 2002 : 7. TechReport Rus |
BibTeX: @techreport{strijov02effect1, author = {Strijov, V. V. and others}, title = {Methodology elements of the university research effectiveness estimations. Part 1}, institution = {Computing Center of the Russian Academy of Sciences}, year = {2002}, pages = {7}, url = {/papers/part1ver1.pdf} } |
Strijov V.V., et al. Methodology elements of the university research effectiveness estimations. Part 2. Computing Center of the Russian Academy of Sciences, Ministry of Education, 2002 : 7. TechReport Rus |
Abstract: This paper describes a discrete regression modeling method based on expert estimations. |
BibTeX: @techreport{strijov02effect2, author = {Strijov, V. V. and others}, title = {Methodology elements of the university research effectiveness estimations. Part 2}, institution = {Computing Center of the Russian Academy of Sciences, Ministry of Education}, year = {2002}, pages = {7}, url = {/papers/part2ver2.pdf} } |
Strijov V.V., Shakin V.V. Ordering of mixed-scaled objects. Computing Center of the Russian Academy of Sciences, 2002 : 8. TechReport Rus |
BibTeX: @techreport{strijov02ranks, author = {Strijov, V. V. and Shakin, V. V.}, title = {Ordering of mixed-scaled objects}, institution = {Computing Center of the Russian Academy of Sciences}, year = {2002}, pages = {8}, url = {/papers/multiscales_indicatros.pdf} } |
Strijov V.V. Time management for development of electronic devices. Computing Center of the Russian Academy of Sciences, 2002 : 2. TechReport Rus |
BibTeX: @techreport{strijov02RnDelektron, author = {Strijov, V. V.}, title = {Time management for development of electronic devices}, institution = {Computing Center of the Russian Academy of Sciences}, year = {2002}, pages = {2}, url = {/papers/RnD_elektron.pdf} } |
2001
Karioukhin E.V., Shakin V.V., Strijov V.V., Matunin E.S., Izgacheva T.S., Kazakova T.V. Mathematical modelling of gerontology-support organizations // The Clinical Gerontology. Scientific Journal. Moscow: Newdiamed, 2001, 7(8) : 89. Article |
BibTeX: @article{karioukhin01model, author = {Karioukhin, E. V. and Shakin, V. V. and Strijov, V. V. and Matunin, E. S. and Izgacheva, T. S. and Kazakova, T. V.}, title = {Mathematical modelling of gerontology-support organizations}, journal = {The Clinical Gerontology. Scientific Journal}, publisher = {Moscow: Newdiamed}, year = {2001}, volume = {7}, number = {8}, pages = {89} } |
Strijov V.V., Shakin V.V. An algorithm for clustering of the phase trajectory of a dynamic system // Mathematical Communications, Supplement, 2001, 1 : 159-165. Article |
Abstract: This paper describes an algorithm of quantitative analysis of the dynamic system behavior. The system behavior is represented as a multivariate phase trajectory. The algorithm clusters the trajectory to satisfy the requirement of the local space dimension. The set of the clusters is represented as an unbalanced tree. The phase trajectory of the Lorenz attractor is examined as a test problem to demonstrate the algorithm. The analysis is intended to describe the behavior of various living systems. The method of the space reduction for the piecewise linear regression models was proposed. |
BibTeX: @article{strijov01clustering, author = {Strijov, V. V. and Shakin, V. V.}, title = {An algorithm for clustering of the phase trajectory of a dynamic system}, journal = {Mathematical Communications, Supplement}, year = {2001}, volume = {1}, pages = {159-165}, url = {/papers/koi2000a.pdf} } |
Strijov V.V. Bidirectional CBT chips application // Schemotechnics, 2001, 2 : 18-19. Article Rus |
BibTeX: @article{strijov01cbt, author = {Strijov, V. V.}, title = {Bidirectional CBT chips application}, journal = {Schemotechnics}, year = {2001}, volume = {2}, pages = {18-19}, url = {/papers/cbt.pdf} } |
Strijov V.V. CMOS buffers // Schemotechnics, 2001, 2 : 20-21. Article Rus |
BibTeX: @article{strijov01cmos, author = {Strijov, V. V.}, title = {CMOS buffers}, journal = {Schemotechnics}, year = {2001}, volume = {2}, pages = {20-21}, url = {/papers/kmop.pdf} } |
Strijov V.V. Live plug-in // Schemotechnics, 2001, 5 : 15-18. Article Rus |
BibTeX: @article{strijov01live, author = {Strijov, V. V.}, title = {Live plug-in}, journal = {Schemotechnics}, year = {2001}, volume = {5}, pages = {15-18}, url = {/papers/s15-18.pdf} } |
Matunin E.S., Izgacheva T.S., Kazakova T.V., Karioukhin E.V., Strijov V.V., Shakin V.V. Mathematical modelling and informational support for gerontology organizations. Computing Center of the Russian Academy of Sciences, 2001 : 79. Book |
BibTeX: @book{matunin01ccas, author = {Matunin, E. S. and Izgacheva, T. S. and Kazakova, T. V. and Karioukhin, E. V. and Strijov, V. V. and Shakin, V. V.}, title = {Mathematical modelling and informational support for gerontology organizations}, publisher = {Computing Center of the Russian Academy of Sciences}, year = {2001}, pages = {79} } |
Molak V., Shakin V.V., Strijov V.V. Kyoto Index for power plants in the USA // The 3-rd Moscow International Conference On Operations Research, 2001 : 80. InProceedings |
BibTeX: @inproceedings{molak01kioto, author = {Molak, V. and Shakin, V. V. and Strijov, V. V.}, title = {Kyoto Index for power plants in the USA}, booktitle = {The 3-rd Moscow International Conference On Operations Research}, year = {2001}, pages = {80} } |
Zubarevich N.V., Tikunov V.S., Krepets V.V., Strijov V.V., Shakin V.V. Multivariate methods for human development index estimation in Russian regions // GIS for area sustainable development. International Conference Proceedings, 2001 : 84-105. InProceedings |
BibTeX: @inproceedings{tikunov01gis, author = {Zubarevich, N. V. and Tikunov, V. S. and Krepets, V. V. and Strijov, V. V. and Shakin, V. V.}, title = {Multivariate methods for human development index estimation in Russian regions}, booktitle = {GIS for area sustainable development. International Conference Proceedings}, year = {2001}, pages = {84-105} } |
Strijov V.V., Shakin V.V., Blagovidov K.V. Concordance of expert estimations for analysis of protected areas management effectiveness // Multivariate statistics analysis applications in economics and quality estimation, 2001 : 30. InProceedings Rus |
BibTeX: @inproceedings{strijov01cemi, author = {Strijov, V. V. and Shakin, V. V. and Blagovidov, K. V.}, title = {Concordance of expert estimations for analysis of protected areas management effectiveness}, booktitle = {Multivariate statistics analysis applications in economics and quality estimation}, year = {2001}, pages = {30}, url = {/papers/cemi2001.pdf} } |
Strijov V.V., Shakin V.V. Expert estimations concordance // Proc. Mathematical Methods of Pattern Recognition, 2001 : 137-138. InProceedings Rus |
BibTeX: @inproceedings{strijov01mmro, author = {Strijov, V. V. and Shakin, V. V.}, title = {Expert estimations concordance}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {2001}, pages = {137-138}, url = {/papers/mmro10.pdf} } |
Strijov V.V., Shakin V.V., Blagovidov K.V. Analysis of protected areas management effectiveness. Computing Center of the Russian Academy of Sciences, 2001 : 11. TechReport Rus |
BibTeX: @techreport{strijov01ar, author = {Strijov, V. V. and Shakin, V. V. and Blagovidov, K. V.}, title = {Analysis of protected areas management effectiveness}, institution = {Computing Center of the Russian Academy of Sciences}, year = {2001}, pages = {11}, note = {Lecture notes}, url = {/papers/cemi01ar.pdf} } |
Strijov V.V., Shakin V.V. Analysis of a dynamic system phase trajectory. Computing Center of the Russian Academy of Sciences, 2001 : 5. TechReport Rus |
BibTeX: @techreport{strijov01clusteringrus, author = {Strijov, V. V. and Shakin, V. V.}, title = {Analysis of a dynamic system phase trajectory}, institution = {Computing Center of the Russian Academy of Sciences}, year = {2001}, pages = {5}, url = {/papers/lorenz-ru.pdf} } |
Strijov V.V., Shakin V.V., Blagovidov K.V. A model of the Protected Areas Management. Computing Center of the Russian Academy of Sciences, World Wide Fund For Nature, 2001 : 7. TechReport Rus |
BibTeX: @techreport{strijov01model, author = {Strijov, V. V. and Shakin, V. V. and Blagovidov, K. V.}, title = {A model of the Protected Areas Management}, institution = {Computing Center of the Russian Academy of Sciences, World Wide Fund For Nature}, year = {2001}, pages = {7}, note = {Manuscript}, url = {/papers/pamodel.pdf} } |
2000
Strijov V.V. Square pulse generators with CMOS ICs // Schemotechnics, 2000, 3 : 25-26. Article Rus |
BibTeX: @article{strijov00cmosgen, author = {Strijov, V. V.}, title = {Square pulse generators with CMOS ICs}, journal = {Schemotechnics}, year = {2000}, volume = {3}, pages = {25-26}, url = {/papers/gen_prym.pdf} } |
Strijov V.V. The IC behavior under low-voltage // Schemotechnics, 2000, 2 : 32-33. Article Rus |
BibTeX: @article{strijov00lowvoltage, author = {Strijov, V. V.}, title = {The IC behavior under low-voltage}, journal = {Schemotechnics}, year = {2000}, volume = {2}, pages = {32-33} } |
Strijov V.V. The simplest PCI interface // Schemotechnics, 2000, 1 : 55-57. Article Rus |
BibTeX: @article{strijov00pci, author = {Strijov, V. V.}, title = {The simplest PCI interface}, journal = {Schemotechnics}, year = {2000}, volume = {1}, pages = {55-57}, url = {/papers/pci.pdf} } |
Strijov V.V. Logic IC with 3V power supply // Schemotechnics, 2000, 3 : 14-15. Article Rus |
BibTeX: @article{strijov00treevolt, author = {Strijov, V. V.}, title = {Logic IC with 3V power supply}, journal = {Schemotechnics}, year = {2000}, volume = {3}, pages = {14-15}, url = {/papers/log_micro.pdf} } |
Strijov V.V., Shakin V.V. An algorithm for clustering of the phase trajectory of a dynamic system // 8-th International Conference on Operational Research, KOI-2000, 2000 : 35. InProceedings |
Abstract: This paper describes an approach to quantitative analysis of a multivariate dynamic system in phase space. The system is used as a mathematical model for various living systems. The model is used in various applications. One of the related problems is to represent a phase trajectory as a sequence of clusters to classify the system's state. |
BibTeX: @inproceedings{strijov00koi, author = {Strijov, V. V. and Shakin, V. V.}, title = {An algorithm for clustering of the phase trajectory of a dynamic system}, booktitle = {8-th International Conference on Operational Research, KOI-2000}, year = {2000}, pages = {35}, url = {/papers/koi2000.pdf} } |
1999
Strijov V.V., Shakin V.V. Phase trajectory analysis software // Proc. Mathematical Methods of Pattern Recognition, 1999 : 227-230. InProceedings |
BibTeX: @inproceedings{strijov99soft, author = {Strijov, V. V. and Shakin, V. V.}, title = {Phase trajectory analysis software}, booktitle = {Proc. Mathematical Methods of Pattern Recognition}, year = {1999}, pages = {227-230}, url = {/papers/mmro9.pdf} } |
Strijov V.V. Phase trajectory analysis software and its applications // Problems of the complex system safety control. VII-th International Conference Proceedings, 1999 : 156-157. InProceedings Rus |
BibTeX: @inproceedings{strijov99rggu, author = {Strijov, V. V.}, title = {Phase trajectory analysis software and its applications}, booktitle = {Problems of the complex system safety control. VII-th International Conference Proceedings}, year = {1999}, pages = {156-157}, url = {/papers/safety99.pdf} } |
1997
Strijov V.V. Motorola IC for TV, video and multimedia overview. Moscow: Motorola GmbH, 1997 : 75. Book Rus |
BibTeX: @book{strijov97motorola, author = {Strijov, V. V.}, title = {Motorola IC for TV, video and multimedia overview}, publisher = {Moscow: Motorola GmbH}, year = {1997}, pages = {75}, url = {/papers/motmult.pdf} } |
1996
Strijov V.V. Configurable processors for biomedical data visualizing // Biosystems under extreme conditions. Computing Center of the Russian Academy of Sciences, 1996 : 47-50. InCollection Rus |
BibTeX: @incollection{strijov08biorecon, author = {Strijov, V. V.}, editor = {Shakin, V.}, title = {Configurable processors for biomedical data visualizing}, booktitle = {Biosystems under extreme conditions}, publisher = {Computing Center of the Russian Academy of Sciences}, year = {1996}, pages = {47-50}, url = {/papers/biorecon.pdf} } |