Investigative Data Mining for Security and Criminal Detection
Chapter 1: Precrime Data Mining
- Figure 1.1: A link analysis can organize views of criminal associations.
- Figure 1.2: Software agents can autonomously monitor events.
- Figure 1.3: Text mining can extract the core content from millions of records.
- Figure 1.4: A neural net can be trained to detect criminal behavior.
- Figure 1.5: CATCH: Computer Aided Tracking and Characterization of Homicides.
- Figure 1.6: September 11, Boston to New York, 8:30 AM.
- Figure 1.7: Illustrative example of the encoding of height as zero or one.
- Figure 1.8: Derived cluster sizes.
- Figure 1.9: Symbolic descriptions of clusters.
- Figure 1.10: Dendrogram for hierarchical agglomerative clustering of SOM cluster centres.
- Figure 1.11: SOM map following merging of spatially near neighbors.
Chapter 2: Investigative Data Warehousing
- Figure 2.1: Sample record extract (criminal record detail).
- Figure 2.2: The iManageData interface.
Chapter 3: Link Analysis: Visualizing Associations
- Figure 3.1: A financial link analysis network.
- Figure 3.2: A NETMAP link chart displaying financial relationships.
- Figure 3.3: The Link Notebook supports zoom in features.
- Figure 3.4: A timeline displaying time-related events.
- Figure 3.5: Confirmed links are shown as solid lines.
- Figure 3.6: Unconfirmed associations are dashed lines.
- Figure 3.7: Members of an organization are grouped inside a box.
- Figure 3.8: An organization can be aggregated as an entity.
- Figure 3.9: The central contact is unknown.
- Figure 3.10: Here Entity 1 is ID.
- Figure 3.11: The links are the intelligence.
- Figure 3.12: A sample of a chart with a legend.
- Figure 3.13: A telephone toll analysis chart.
- Figure 3.14: Voluminous amounts of data can lead to vague charts.
- Figure 3.15: An analyst can move events and change the chart as needed.
- Figure 3.16: Events are placed on the theme they relate to.
- Figure 3.17: Several events can also be combined.
- Figure 3.18: Multiple events and transactions can be mapped.
- Figure 3.19: The association matrix in Crime Link.
- Figure 3.20: From the matrix Crime Link generates its diagrams.
- Figure 3.21: A Daisy chart showing a date and time analysis.
- Figure 3.22: The formats supported by NETMAP.
- Figure 3.23: This chart shows the link between the nodes at both ends.
- Figure 3.24: An ORIONLink sample diagram.
Chapter 4: Intelligent Agents: Software Detectives
- Figure 4.1: Bio-terrorism system using agents with sensors.
- Figure 4.2: Agent system would serve to provide early detection.
- Figure 4.3: Agentland.com provides agent software for downloading.
- Figure 4.4: A menu of development agent software available.
- Figure 4.5: The completed agent form.
- Figure 4.6: A list of results is generated, each with an associated relevance score.
Chapter 5: Text Mining: Clustering Concepts
- Figure 5.1: Topics derived from clustering 60,000 news reports.
- Figure 5.2: An 86-word summary of the news stories.
- Figure 5.3: WordStat univariate word-frequency analysis.
- Figure 5.4: ClearForest taxonomy graphical view of an individual.
- Figure 5.5: Dynamic view of relationships.
- Figure 5.6: TextRoller summary results.
- Figure 5.7: A Leximancer concept map of 155 Internet news groups.
- Figure 5.8: TripleHop's three-layer architecture.
- Figure 5.9: The VisualText GUI interface.
Chapter 6: Neural Networks: Classifying Patterns
- Figure 6.1: This is the suspect the police are searching for.
- Figure 6.2: Attrasoft ImageFinder during training.
- Figure 6.3: System recognized the suspect wearing a hat.
- Figure 6.4: System recognized suspect with a beard.
- Figure 6.5: This is how the data looks in our Border Profile database.
- Figure 6.6: The different colors represent different stages of alerts.
- Figure 6.7: The cluster of arrests can be marked and exported to a file.
- Figure 6.8: Example given to the neural network. The C-12 denotes the position of dodecane.
- Figure 6.9: One of the two matches found by the neural network. The C-12 denotes the position of dodecane.
- Figure 6.10: A second match found by the neural network. The C-12 denotes the position of dodecane.
- Figure 6.11: The closest non-match found by the neural network. The C-12 denotes the position of dodecane.
- Figure 6.12: The CRISP-DM methodology.
- Figure 6.13: Primary network of offenders.
- Figure 6.14: Distance chart.
- Figure 6.15: Crimes by time of day.
- Figure 6.16: Crimes by day of week.
- Figure 6.17: Spatial analysis.
- Figure 6.18: Schematic data flow.
- Figure 6.19: Panes allow the user to visualize the network training results.
- Figure 6.20: Training to recognize the number 5.
Chapter 7: Machine Learning: Developing Profiles
- Figure 7.1: Decision tree used to predict probability of smuggling by make of auto.
- Figure 7.2: The Anti-Drug Network (ADNET).
- Figure 7.3: The ADNET control center.
- Figure 7.4: Eleven sets of training, testing, validation data (33 sets in all).
- Figure 7.5: The data was rotated in the training, testing, and validation phases.
- Figure 7.6: Five algorithms on six data sets yielded different results.
- Figure 7.7: Model ensembles make decisions by committee of algorithms.
- Figure 7.8: Data is prepared for mining.
- Figure 7.9: Model creation stream in Clementine.
- Figure 7.10: Results of final models.
- Figure 7.11: Overall model score on validation data.
- Figure 7.12: Alice decision tree interface.
- Figure 7.13: Alice d'Isoft 6.0 decision tree output.
- Figure 7.14: Business Miner decision tree interface.
- Figure 7.15: This is the CART interface for model setup.
- Figure 7.16: The CART binary trees.
- Figure 7.17: Lift charts for each class from the decision trees can be viewed.
- Figure 7.18: This instrument displays the variables of most importance.
- Figure 7.19: The rates of prediction for training and testing classes can be viewed.
- Figure 7.20: Sample of CART rules.
- Figure 7.21: SuperQuery IF/THEN dialog box.
- Figure 7.22: Alert is the field from which rules will be generated.
- Figure 7.23: This dialog box in WizWhy allows for the setting of rule parameters.
- Figure 7.24: This is rule 6, from a total of 214 rules. Note the conditions for a high alert.
- Figure 7.25: Decision trees can be split on any desired variable in the database.
- Figure 7.26: Decision tree split on the basis of vehicle make.
- Figure 7.27: Multiple analyses can be performed by inserting them via a drop window.
- Figure 7.28: Note the improved performance at 70% of population.
- Figure 7.29: Rules can be produced in various formats.
- Figure 7.30: Partial view of rules generated in Java from this tool.
- Figure 7.31: The Neural Net Wizard interface.
- Figure 7.32: This is PolyAnalyst's main window.
- Figure 7.33: This is the data import wizard interface.
- Figure 7.34: The Visual Rule Assistant simplifies rule generation.
- Figure 7.35: Decision tree interface with summary statistic window.
- Figure 7.36: A schematic decision tree.
- Figure 7.37: Decisionhouse graphical displays.
- Figure 7.38: Enterprise Miner's SEMMA process.
- Figure 7.39: Clementine uses icons to perform data mining analyses.
- Figure 7.40: NCR's Data Mining Method and Teradata Warehouse Miner Technology.
Chapter 8: NetFraud: A Case Study
- Figure 8.1: Associations between products and fraud. Note the bold line between hardware/software and fraud.
- Figure 8.2: A clustering map where light shades are legal and dark areas are fraudulent transactions.
- Figure 8.3: We mark the section of fraudulent transactions.
- Figure 8.4: Camcorders with an average price of $1,052 are a major target for fraud.
- Figure 8.5: The error rate is only about 8% for this neural-network model.
- Figure 8.6: This sensitivity instrument prioritizes the inputs for a fraud model.
- Figure 8.7: A view of the training of the perceptron neural network.
- Figure 8.8: Decision trees can uncover hidden ranges where fraud is higher than average.
- Figure 8.9: As fraud statistics show, computer equipment is high on criminals' lists.
- Figure 8.10: Fraud is highest in households where the median rent is $425-$548.
Chapter 9: Criminal Patterns: Detection Techniques
- Figure 9.1: The CRISP-DM process.
Chapter 10: Intrusion Detection: Techniques and Systems
- Figure 10.1: Thirty-day summary of File Transfer Protocol connections.
- Figure 10.2: An IDS is only part of the entire deterrence process.
Chapter 11: The Entity Validation System (EVS): A Conceptual Architecture
- Figure 11.1: Incremental profiles are distributed.
Chapter 12: Mapping Crime: Clustering Case Work
- Figure 12.1: MAPS links to crime maps and statistics for various cities.
- Figure 12.2: A view of crimes by types in the central part of the city.
- Figure 12.3: San Diego interactive crime map.
- Figure 12.4: Approach and verbal-themes behavior.
- Figure 12.5: Approach and precautions behavior.
- Figure 12.6: The SOM represents about 5,000 murders in the HITS database.
- Figure 12.7: Crimes are mapped by modus operandi descriptions.
- Figure 12.8: Order and description of crimes such as rape, serial and rituals can be queried.
- Figure 12.9: The figure shows all the crime data vectors as points in a three-dimensional eigenspace.
- Figure 12.10: Crimes can be mapped along highways.
- Figure 12.11: Similarity of crimes can be viewed and measured via a grid.
- Figure 12.12: Comparison of crime types can be measured.
- Figure 12.13: Probability and distance of crimes by the same perpetrator can be graphed.
- Figure 12.14: The solid line in the graph shows the probability of finding two sexual assaults by one serial rapist n number of cells apart.