To avoid duplication of the same methods applied to similar problems, please let me know what you plan to do: send me an email with your name, software name, method and data, and I shall put it on this page to let the others know that the topic is taken.
- Describe the method in detail on a Web page or in a formal report; the description should be sufficiently detailed to explain the algorithm, and to understand what the software is doing and how to interpret its results.
- Find software that implements it and describe it briefly.
If you choose a larger software package, like XGobi, select one or two methods; don't describe the whole package!
- Find some interesting data: either generate some data yourself, or grab some data from the network.
Some interesting data sets for visualization are here;
please let me know if you find more interesting data repositories and I shall add them there.
- Prepare your data (a change of format or standardization may be needed), and visualize it.
- Analyse the usefulness of the results; what can we learn from this exercise?
Perhaps some hypothesis can be proved/disproved, perhaps interesting structure in data will be noticed.
Do not focus your work on accuracy only, or on new methods and their improvement - the assignment is not a research report!
The main goal for this assignment is to learn something about the data, not about the methods.
Try to find the simplest description: most informative features, rules from trees, associations between variables, clusters, understanding of category structure.
Accuracy is of secondary importance as long as it is not very bad. Fashionable methods like SVMs may provide some knowledge if only a small number of support vectors is left and they are carefully analyzed, or if the SVM is used with different subsets of features, but we should not be happy with a high-accuracy black box that does not explain its decisions.
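As an illustration of the data preparation step mentioned above, here is a minimal sketch of standardization (z-scoring each feature) in plain Python. The function name `standardize` and the toy data are assumptions made for this example only; they do not come from any particular software package.

```python
from statistics import mean, pstdev


def standardize(rows):
    """Z-score each column: subtract the column mean, divide by the column
    (population) standard deviation. Constant columns are mapped to 0.0."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    stds = [pstdev(c) for c in columns]
    return [
        [(x - m) / s if s > 0 else 0.0 for x, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Hypothetical toy data: three samples, two features on very different scales.
data = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
scaled = standardize(data)
```

After this step every feature has zero mean and unit variance, so no feature dominates a distance-based visualization merely because of the units it was measured in.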
The deadline for these assignments is 29.06.
Electronic copy is sufficient: please zip or compress all files, give the file your name and send it to me.
Try to minimize the size of the file: I want to attach these files to this page and I do not have much space on the WWW server!
Q & A:
1. Is a formal report required? If so, how long should it be?
Your paper is a report. The length should be sufficient for others who have taken this course to be able to understand the method and interpret the results.
2. As visualization is clearer on screen, should we design a webpage instead of writing a report?
Web pages are OK, we may put them into the e-dventure space, but please send me the files.
3. What must we do to score high marks for this assignment? Do we have to study CI techniques in much greater depth than what was covered in the lectures?
Find an interesting method, perhaps a new variant of one of the methods I talked about,
find interesting data,
provide an interpretation using an interesting method of analysis, including visualization,
describe what we can learn from it,
mention potential applications.
Topics taken in the past:
- Analysis of Cervical Cancer data
- SVM for Pima Indians Diabetes
- Decision tree/SVM for adult database
- kNN/SVM for Wine recognition database
- Analysis of Image Segmentation data using C4.5
- Decision Tables for Soybean Data Analysis
- Ensemble Classification of Vowel Data
- Data Analysis with Transductive Support Vector Machine
- Parzen Window Classifier for Wisconsin Prognostic Breast Cancer (WPBC)
- SVM for Image Segmentation
- Credit approval using SSV trees
- CART for heart disease
- Efficient C4.5 for Adult database
- Abalone Age Prediction Using Bayesian Belief Networks
- Bayesian analysis of football scores
- SVM analysis for intrusion detection
- Face recognition
- SVM for Spam Filtering
- kNN+Decision Tree for forensic glass identification.
- Parzen windows / Probabilistic neural networks (PNNs) for Ionosphere database
- What makes your H6429 assignments score high - Feature selection
- Teaching Assistant Evaluation with fuzzy NN
- SVM for UCI heart disease
- Reduced Coulomb Energy (RCE) Networks
- SVM for regression on Abalone/Auto-mpg
- Soccer player classification with decision trees
- Decision tree for voting records
- Filter Model Feature Selection Algorithms
- Analysis on internet advertisements dataset using SVM
- Learning post-operative patient care system using IBM
- Data Analysis with the marriage of spectral kernel and SVM
- K Nearest Neighbor method for animal classification
- A Rule-based Approach for Classifying Census Dataset
- On estimating the parameters of Gaussian mixtures using EM
- NLPCA with Principal Curves on the Application of Ice Floe Identification in Satellite Images.
- Glass Identification with Kernel PCA.
- PCA on breast cancer detection system.
- Learning Object Shapes Distribution Based on Kernel PCA.
- PCA on face recognition and hand posture recognition.
PCA | Matlab | UMIST face database.
- MDS and ISOMAP visualization of Brodatz Texture data.
- Locality Preserving Projections for images/expression data.
- Visualization of support vector regression machine.
- Comparison of linear discriminants in data analysis.
- Demixing data using ICA.
- Multidimensional Scaling on soybean data.
- PCA and LDA on wine data.
- Visualization of Gene Expression Data by SVD and Kernel PCA.
- ICA Infomax approach for speech.
- Cluster visualization using Laplacian eigenmaps.
- Visualization using PCA and Kernel PCA.
- SVM kernel based visualization and classification.
- Growing Cell Structures for balance scale data.
- PCA for letters.
- Magnetic Recording Media visualization.
- Generative Topographic Mapping (GTM) on Ionosphere dataset.
- Emergent SOM on medical data.
- Kernel Fisher discriminant analysis on very sparse text data.
- Regression Tree to Predict The Boston House Values.
- Glass Identification using CART.
- Possibilistic clustering for neural based breast cancer detection in thermograph.
- MLP analysis and evaluation as approximation method.
- SVM for texture classification.
- EM Algorithm for Background Modelling of Image Sequences.
- AdaBoost for classification and feature selection.
- Event data analysis and event detection for anticipatory news events with SVM classification.
- Basketball Player Characteristics Analysis Using Decision Tree.
- Data analysis of Tic-Tac-Toe using decision tree.
- Data mining on the Wisconsin Breast Cancer.
- Feature selection for microarray data.
- Credit Scoring Model Using Quick, Unbiased and Efficient Statistical Tree.
- Understanding Amino Acids - Types + Sequences.
- FCMAC-AARS: A brain-inspired fuzzy neural architecture for data mining and knowledge discovery.
- CART/DTREG Analysis of Diabetes data.
- Gene Expression Data Analysis by Decision Tree and AANN.
- Classification of animals using the C4.5 decision tree.
- Ada-boosting for rectangular face feature selection.
- Heart Disease Diagnosis using SSV Decision Tree.
- Decision Tree analysis of the 1984 United States Congressional Voting Records.
- Wrapper feature selection approaches: class dependent or class independent?
- Intelligent Learning by Decision Trees.
A few interesting papers with novel methods are below:
SVM - kernel visualization
Visualization of neural networks from ICNNSC
Visualization of neural networks from IJCNN