
Social network analysis in the field of criminology

A criminal network analysis

Master's Thesis 2015 51 Pages

Engineering - Computer Engineering

Excerpt

Table of Contents

List of Figures

List of Tables

List of Equations

Acknowledgement

Abstract

1. Introduction
1.1. Social Network Analysis
1.2. Characteristics of Social Networks
1.3. Motivation of Social Network Analysis in the field of Criminology
1.4. How Can Social Network Analysis Be Applied to Criminology?
1.5. Objectives

2. Literature Survey

3. Common Terms and Concepts
3.1. Data Mining
3.2. WEKA
3.3. Gephi
3.4. Community Analysis
3.4.1. Member Based Community Detection
3.4.2. Node Similarity
3.5. Recommendation in Social media
3.5.1. Collaborative Filtering
3.5.2. Model based Collaborative Filtering
3.5.3. Content Based Method
3.6. Vector Space Model
3.7. Basic Concepts of Information Theory
3.8. Betweenness Centrality
3.9. What is Classification?
3.10. Decision Tree Induction
3.11. Herd Behavior
3.12. Collective Behavior

4. Find Communities Depending on Criminal Incidents & Recommend Result to Users
4.1. Objectives
4.2. Data Model
4.3. Proposed Model
4.4. Result Analysis
4.5. Summary

5. Detect Criminal Network & Group Activity Using Collaborative Filtering
5.1. Objectives
5.2. Proposed Model
5.3. Result Analysis
5.4. Summary

6. A Statistical Model to Determine the Behavior Adoption in Victims in Different Timestamps
6.1. Objectives
6.2. Survey on Facebook to Collect Data
6.3. Proposed Procedure
6.4. Result Analysis
6.5. Summary

7. Conclusion & Future Scope

8. References

List of Figures

Figure 3.1.Process of Data Mining

Figure 3.2.Decision Tree Generation

Figure 4.1.Model 1 representation of the process

Figure 4.2.Column chart representation of crime record

Figure 4.3.Decision tree generation in WEKA

Figure 4.4.Age, Gender and Social Status representation

Figure 4.5.Nodes representation of criminal incidents

Figure 4.6.Edge details of criminal incidents

Figure 5.1.Model 2 representation of the process

Figure 5.2.Suspect’s network

Figure 5.3.Betweenness centrality representation

Figure 5.4.2-D plot representation of suspects & criminal incidents

Figure 5.5.2-D plot representation of Suspects

Figure 5.6.Suspect’s connectivity

Figure 5.7.Preferable crime activity by a group

Figure 5.8.Most preferable crime activity by a group

Figure 6.1.Snapshot of survey page

Figure 6.2.IDF weighting vs. Doc Frequency

Figure 6.3.Occurrence rate of each term

Figure 6.4.Term growth in different timestamps

Figure 6.5.Percentage decrement of Annoyed & Worried at t2 over t

Figure 6.6.Chances to be selected {1, 3, 5} over {1, 2, 5}

List of Tables

Table 3.1.Matrix presentation of query

Table 3.2.Matrix presentation using tf-idf values

Table 4.1.Data structure represents information given by victims

Table 4.2.Group representation of criminal incidents

Table 4.3.Vectorization process (1)

Table 4.4.Frequency-idf values

Table 4.5.Vectorization process (2)

Table 4.6.User Details

Table 4.7.User tf-idf details

Table 4.8.Similarity Calculation

Table 5.1.Suspect’s ratings

Table 5.2.Suspect’s rating for group

Table 5.3.Suspect’s rating for group

Table 5.4.Suspect’s rating for group

Table 5.5.Suspect’s rating for behavior analysis

Table 5.6.Decomposed value of Suspect’s rating

Table 5.7.Diagonal Matrix

Table 5.8.Decomposed value for crime incidents ratings

Table 6.1.Example of term occurrences

List of Equations

Equation 3.1 Measure of vertex similarity

Equation 3.2 Jaccard similarity

Equation 3.3 Cosine Similarity

Equation 3.4 Cosine Similarity between Two Vectors

Equation 3.5 Information Measure

Equation 3.6 Compute the Numbers of Shortest Paths

Equation 3.7 Betweenness Centrality

Equation 3.8 Normalized Betweenness Centrality

Acknowledgement

I am sincerely and heartily grateful to my advisors, Dr. Dipak Kumar Kole and Dr. Chandan Giri, for the support and guidance they showed me throughout my thesis work. Without their guidance and persistent help this thesis would not have been possible. I wish to express my sincere thanks to our Director, Dr. Arindam Biswas. I am also thankful to the Dean for providing an excellent environment for the completion of this project. I would like to thank the faculty members of the Information Technology department of the INDIAN INSTITUTE OF ENGINEERING SCIENCE AND TECHNOLOGY, SHIBPUR, for giving me the opportunity to work under their supervision and for enriching my knowledge. Besides, I would like to thank my parents and my classmates, who supported me morally and provided me with great information resources, and especially Mr. Dhrubasish Sarkar (Assistant Professor, Amity University, Kolkata), who has been a pioneer for me in this journey.

ABSTRACT

In the 21st century, technological advancement in computation has left its footprint in every field of science, sociology, health, business, and many others. People are now connected across the world through the Internet, and much of this communication takes place freely over social networks. From that point of view, researchers analyze social networks for the betterment of different domains, or try to build something useful from them. This work uses that ground to help victims and to use their information to prevent upcoming criminal activities by analyzing it. It will help common people learn how and where antisocial activities are taking place. WEKA, a state-of-the-art workbench that implements data mining algorithms for developing machine learning (ML) techniques and applying them to real-world data mining problems, will be used to analyze the collected data. After data collection, communities will be formed based on the different types of criminal incidents. These communities are helpful for detecting real-world gang or group activities related to crime. Recommendation in social media is a useful concept for making people aware and for fighting antisocial activities. Model-based collaborative filtering, a type of recommendation in social media, will be used to find common criminal behavior towards different criminal activities. The concepts of herd behavior and collective behavior will also be used to find victims' adoption of behaviors at different timestamps relative to the initial timestamp.

Keywords:

Data Mining, Data Analysis, Social Network Analysis, WEKA, Community Detection, Recommendation in Social Media, Centrality, Singular Value Decomposition, Herd Behavior, Collective Behavior

Chapter 1

Introduction

1.1. Social Network Analysis

Social networks are a result of the successful journey of the Internet, which in the 21st century has revolutionized the computing world and electronic media. The mapping and measuring of relationships and flows between people, organizations, and computers is the basis of social network analysis. Social network analysis is a method by which one can measure or analyze the connections between individuals or groups. Its advantage is that, unlike other methods, it focuses on the interaction rather than on individual behavior. It can be applied across disciplines: social networks, political networks, etc. Much early research in network analysis is found in educational psychology and studies of child development. Network analysis also developed in fields such as sociology and anthropology. In the 19th century, Durkheim wrote of "social facts", phenomena that are created by the interactions of individuals yet constitute a reality that is independent of any individual actor [1]. At the turn of the 20th century, Simmel was one of the first scholars to think in relatively explicit social network terms. He examined how third parties could affect the relationship between two individuals, and how organizational structures were needed to coordinate interactions in large groups. After the 1950s, networks were less evident in social psychology and more evident in sociology (particularly economic sociology) and in anthropology. Developments in the last few decades include much attention to several concepts, including "the strength of weak ties" and "small worlds" [2]. The ideas of "small worlds" and "the strength of weak ties" are completely based on graphs. A graph, or network, is a set of units that may be connected to each other, so our sole concern in this research will be the different characteristics of graph theory.

1.2. Characteristics of Social Networks

Most people use social networks to socialize and exchange information, thoughts, and ideas; however, they can also be used for antisocial activities. Myspace, developed by Tom Anderson and Chris De Wolfe in July 2003, was an early social networking site [3], but in 2008 it was overtaken by its competitor Facebook, founded in 2004 by Mark Zuckerberg, Eduardo Saverin, Dustin Moskovitz, and Chris Hughes. The main characteristics for which social networking sites have become popular are summarized below.

- User based: Users submit and organize the content, and its direction is determined by the users themselves; no single person dictates the current topic. Content is freeform and unstructured, members hold common beliefs or interests, and users can make new friends with people who share those interests or beliefs.
- Interactive: Users get not only chat rooms but also games, quizzes, puzzles, etc. Social networks connect friends and professionals under the same roof.
- Relationship driven: Social networks are driven by the relationships between their members, and information is dispersed to friends, their friends, and so on.

1.3. Motivation of Social Network Analysis in the field of Criminology

Criminology is the branch of science that deals with criminal activities of different types. In the present computing world, the World Wide Web is one medium through which criminal activities can be carried out, and we are well aware of such activities. Online transactions are very popular nowadays owing to the active presence of e-commerce, so different kinds of hacking activities, including terrorist activities, can be noticed. An earlier work, similar to ours, analyzed terrorism-related activity on a particular social network, Twitter, using different NLP techniques [4]. In social networks one can find many different kinds of people and can easily communicate with them, share thoughts, and form relationships. Many fraudulent activities have been reported in the past several years by users of social networks. Social networks can therefore be analyzed to reduce criminal activity.

1.4. How Can Social Network Analysis Be Applied to Criminology?

Crime is a social problem: people rob or steal from other people, murders most often occur between friends or acquaintances, and domestic violence takes place within a set of familiar relationships that is highly contextualized. Outside of age and gender, the number of delinquent friends one has is perhaps one of criminology's most robust predictors of crime.

Thinking of the world in networked terms therefore does not require a departure from current criminological thinking. Instead, it requires us to take stock of our discipline and engage in a candid discussion about how formal network methodologies might inform criminology and, conversely, how criminology might inform social network analysis. To do so, one must consider how the things criminologists care about, and measure, can be thought of in terms of relationships among actors. For example, the study of group process and group influence can move past counting the number of delinquent friends one has and begin to measure the content and form of one's social network in more tangible and quantifiable ways: the composition of a network, a tendency towards homophily, a particular patterning of social ties, or the propensity to form or avoid ties with particular types of individuals [5].

Unlike other types of crime, often committed by a single offender or a few offenders, organized crimes are carried out by multiple collaborating offenders, who may form groups and teams and play different roles. In a narcotics network, for instance, different groups may be responsible for handling the drug supply, distribution, sales, smuggling, and money laundering. In each group there may be a leader who issues commands and provides steering mechanisms to the group, as well as gatekeepers who ensure that information and drugs flow effectively to and from other groups. Criminal network analysis therefore requires the ability to integrate information from multiple crime incidents, or even multiple sources, and to discover regular patterns in the structure, organization, operation, and information flow of criminal networks [6].

1.5. Objectives

In a social network each user can be treated as a node, and the communication between nodes can be treated as edges, which together create a graph. Edges are divided into two groups, incoming and outgoing, which give the in-degree and out-degree of a node. Information thus travels from one node to another through the network. The concepts of social network analysis will be used in criminology to help victims and detect criminal networks.

The objectives are listed below;

1) Collect data of victims related to different types of criminal activities.
2) Create groups of criminal incidents (based on member-based community detection using node similarity, where each node is an incident).
3) Apply the content-based filtering method to find the similarity between the crime incident descriptions and the users' profile information (with or without location information). To do so, first represent both the user profile and the criminal incident by the vector space model, then apply cosine similarity and recommend incidents to the users.
4) Detect a criminal network and rate the suspects (collaborative filtering). Then find the linked persons who are valuable in the network (betweenness centrality).
5) Detect common behaviors that criminals exhibit across different crime incidents and detect gang activities.
6) Use a statistical model to determine the behavior adoption among victims at different timestamps relative to the initial timestamp.

The rest of the report is organized as follows. In chapter 2, related prior work is described. In chapter 3, some common terms and concepts are briefly explained. Chapter 4 discusses a model to find communities depending on criminal incidents and a recommendation process for users, with experimental results. In chapter 5, a model to detect criminal networks and group activity using collaborative filtering is described, with experimental results. Chapter 6 explains a statistical model to determine the behavior adoption of victims at different timestamps. Finally, chapter 7 concludes this report and shows the scope for future research.

Chapter 2

Literature Survey

The journey of the Internet started in 1969, when ARPANET successfully communicated a few pieces of data between two nodes. In its initial stages it served only research purposes and was used by the US military for defense. But in 1989 Tim Berners-Lee, a researcher at CERN, proposed a new idea, known today as the WWW or World Wide Web [7]. It is an international system of protocols for building distributed hypermedia servers, allowing users to create electronic documents that point to many different files, of potentially different types, across the world. He created the first browser/editor and communication software, defining URLs (Uniform Resource Locators), HTTP (Hypertext Transfer Protocol), and HTML (Hypertext Markup Language) [8].

These days the Internet is becoming one of the most important aspects of our society; life without it is hard to imagine, given how much it has changed the human lifestyle. In every field, from birth to death, information is being recorded into computers, or systems are being computerized. The large amounts of data that must be stored centrally require a database to keep records safely, and the concept of a central database can be achieved only through networks, or more precisely through the Internet and network-oriented approaches. The Internet is thus accepted by society for its communication speed and accuracy. In the last few years social networks have become popular not only among the young but among people of every age.

A few researchers have already worked on how data mining can be applied in the field of criminology. S. Yamuna and N. Sudha Bhuvaneswari published a paper named "Datamining Techniques to Analyze and Predict Crimes", where they proposed clustering and classification techniques to analyze, detect, and predict crimes [9]. Shyam Varan Nath explained how clustering techniques can be useful in crime pattern detection in his paper "Crime Pattern Detection Using Data Mining" [10].

Several criminologists have also worked in this particular discipline: Andrew V. Papachristos on "The Coming of a Networked Criminology", Jerzy Sarnecki on "Social network analysis and criminology", Noora Al Mutawa, Ibrahim Baggili, and Andrew Marrington on "Forensic analysis of social networking applications on mobile devices", and A. Karran, J. Haggerty, D. Lamb, M. Taylor, and D. Llewellyn-Jones on "A Social Network Discovery Model for Digital Forensics Investigations" [11].

In India, the NCRB (National Crime Records Bureau) has the proud distinction of installing 762 server-based computer systems at every District Crime Records Bureau and State Crime Records Bureau across the country under the 'Crime Criminal Information System (CCIS)' project, with a view to maintaining a national-level database of crimes, criminals, and property related to crime. NCRB is also implementing the Crime and Criminal Tracking Network and Systems (CCTNS), a mission mode project under the National e-Governance Plan of the Government of India. The Cabinet Committee on Economic Affairs (CCEA) approved the CCTNS project on 19.06.2009 with an allocation of Rs. 2000 crore.

Chapter 3

Common Terms and Concepts

3.1. Data Mining

Data mining is the process of discovering interesting knowledge from large amounts of data stored in databases, data warehouses, or other information repositories. Data mining is often used as a synonym for KDD (Knowledge Discovery from Databases), but it is in fact one essential step of that process, the step that uncovers hidden patterns for evaluation [12]. The process of data mining is shown in figure 3.1.

[Figure not included in this excerpt]

Figure 3.1.Process of Data Mining

3.2. WEKA

WEKA is a data mining system developed at the University of Waikato in New Zealand that implements data mining algorithms. It is a state-of-the-art facility for developing machine learning (ML) techniques and applying them to real-world data mining problems. It is a collection of machine learning algorithms for data mining tasks; the algorithms are applied directly to a dataset. WEKA implements algorithms for data preprocessing, classification, regression, clustering, and association rules, and it also includes visualization tools. New machine learning schemes can also be developed with the package. WEKA is open source software issued under the GNU General Public License [13].

3.3. Gephi

Gephi is an interactive visualization and exploration platform for all kinds of networks and complex systems, including dynamic and hierarchical graphs. It is open source network analysis and visualization software written in Java on the NetBeans platform, initially developed by students of the University of Technology of Compiègne in France. The software is applied in many fields, such as exploratory data analysis, link analysis, social network analysis, biological network analysis, and poster creation.

3.4. Community Analysis

Also known as groups, clusters, or cohesive subgroups, communities have been studied extensively in many fields and, in particular, the social sciences. In social media mining, analyzing communities is essential. Studying communities in social media is important for many reasons. First, individuals often form groups based on their interests, and when studying individuals, we are interested in identifying these groups. Consider the importance of finding groups with similar reading tastes by an online book seller for recommendation purposes. Second, groups provide a clear global view of user interactions, whereas a local-view of individual behavior is often noisy and ad hoc. Finally, some behaviors are only observable in a group setting and not on an individual level. This is because the individual’s behavior can fluctuate, but group collective behavior is more robust to change [14].

3.4.1. Member Based Community Detection

The intuition behind member-based community detection is that members with the same (or similar) characteristics are more often in the same community. Therefore, a community detection algorithm following this approach should assign members with similar characteristics to the same community. In theory, any subgraph can be searched for and assumed to be a community; in practice, only subgraphs whose nodes have specific characteristics are considered communities. Three general node characteristics that are frequently used are node similarity, node degree (familiarity), and node reachability [15].

3.4.2. Node Similarity

Node similarity attempts to determine the similarity between two nodes vi and vj. Similar nodes (or most similar nodes) are assumed to be in the same community. Determining similarity between two nodes has been addressed in different fields; in particular, the problem of structural equivalence in the field of sociology considers the same problem. In structural equivalence, similarity is based on the overlap between the neighborhoods of the vertices.

Let N(vi) and N(vj) be the neighbors of vertices vi and vj, respectively. In this case, a measure of vertex similarity (equation 3.1) can be defined as follows [15]

σ(vi, vj) = |N(vi) ∩ N(vj)| (3.1)

For large networks, this value can grow large simply because nodes share many neighbors. Generally, similarity is therefore normalized to a bounded value, usually in the range [0, 1]. Various normalization procedures can be applied, such as the Jaccard similarity (equation 3.2) or the cosine similarity (equation 3.3):

σJaccard(vi, vj) = |N(vi) ∩ N(vj)| / |N(vi) ∪ N(vj)| (3.2)

σCosine(vi, vj) = |N(vi) ∩ N(vj)| / sqrt(|N(vi)| · |N(vj)|) (3.3)
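As a concrete illustration, the three node-similarity measures of equations 3.1-3.3 can be sketched in a few lines of Python; the neighbor sets below are hypothetical, not taken from the thesis data.

```python
# Sketch of the node-similarity measures (equations 3.1-3.3) on
# illustrative neighbor sets.

def overlap(n_i, n_j):
    """Equation 3.1: raw neighborhood overlap |N(vi) ∩ N(vj)|."""
    return len(n_i & n_j)

def jaccard(n_i, n_j):
    """Equation 3.2: overlap normalized by the union, bounded in [0, 1]."""
    return len(n_i & n_j) / len(n_i | n_j)

def cosine(n_i, n_j):
    """Equation 3.3: overlap normalized by the geometric mean of degrees."""
    return len(n_i & n_j) / (len(n_i) * len(n_j)) ** 0.5

n_v1 = {"a", "b", "c"}                 # hypothetical neighbors of v1
n_v2 = {"b", "c", "d"}                 # hypothetical neighbors of v2
print(overlap(n_v1, n_v2))             # 2
print(round(jaccard(n_v1, n_v2), 3))   # 0.5  (= 2/4)
print(round(cosine(n_v1, n_v2), 3))    # 0.667 (= 2/3)
```

Note how the two normalized measures stay in [0, 1] even when the raw overlap grows with network size.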

3.5. Recommendation in Social media

Recommender systems are commonly used for product recommendation. Their goal is to recommend products that would be interesting to individuals. Formally, a recommendation algorithm takes a set of users U and a set of items I and learns a function f such that:

f : U × I → ℝ

In other words, the algorithm learns a function that assigns a real value to each user-item pair (u, i), indicating how interested user u is in item i. This value denotes the rating given by user u to item i. The recommendation algorithm is not limited to item recommendation and can be generalized to recommending people and material such as ads or content [15].

3.5.1. Collaborative Filtering

The input to the collaborative filtering algorithm is an m×n matrix in which rows are items and columns are users, rather like a term-document matrix (items play the role of terms and users the role of documents). We can therefore think of items as vectors in the space of users, or users as vectors in the space of items. The algorithm proceeds as follows: weight all users with respect to their similarity to the active user; select a subset of the users (the neighbors) to use as predictors; normalize the ratings and compute a prediction as a weighted combination of the selected neighbors' ratings; and present the items with the highest predicted ratings as recommendations [15].
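These steps can be sketched as a minimal user-based collaborative filtering routine; the ratings matrix and the user/item names are invented for illustration, and the similarity measure here is a simple cosine over the raw rating vectors.

```python
# Minimal user-based collaborative filtering sketch: weight users by
# similarity to the active user, then predict an unseen item's rating
# as a similarity-weighted average of the neighbors' ratings.
import math

ratings = {                       # hypothetical: user -> {item: rating}
    "u1": {"i1": 5, "i2": 3, "i3": 4},
    "u2": {"i1": 4, "i2": 2, "i3": 5},
    "u3": {"i1": 1, "i2": 5},     # active user; has not rated i3
}

def cosine_sim(a, b):
    """Cosine similarity over the items the two users rated in common."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[i] * b[i] for i in common)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den

def predict(active, item):
    """Weighted combination of the other users' ratings for `item`."""
    pairs = [(cosine_sim(ratings[active], ratings[u]), r[item])
             for u, r in ratings.items() if u != active and item in r]
    total = sum(w for w, _ in pairs)
    return sum(w * r for w, r in pairs) / total if total else 0.0

print(round(predict("u3", "i3"), 2))   # between 4 and 5: both neighbors liked i3
```

A full implementation would also mean-center the ratings before combining them, as the normalization step above suggests.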

3.5.2. Model based Collaborative Filtering

In model-based collaborative filtering, one assumes that an underlying model governs the way users rate. Among a variety of model-based techniques, we focus on a well-established one based on singular value decomposition (SVD) [16].

Singular Value Decomposition:

SVD is a linear algebra technique that, given a real matrix X ∈ R^(m×n), m ≥ n, factorizes it into three matrices as X = UΣV^T, where U ∈ R^(m×m) and V ∈ R^(n×n) are orthogonal matrices and Σ ∈ R^(m×n) is a diagonal matrix. The product of these matrices is equivalent to the original matrix, so no information is lost; the process is lossless.
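This lossless property is easy to check numerically, assuming NumPy is available; the toy ratings matrix below is illustrative, not from the thesis data.

```python
# Numerical check of the SVD factorization X = U Σ V^T on a toy
# ratings matrix: factor, reconstruct, and verify nothing is lost.
import numpy as np

X = np.array([[5.0, 3.0, 0.0],
              [4.0, 0.0, 2.0],
              [1.0, 1.0, 5.0],
              [0.0, 4.0, 4.0]])    # hypothetical 4 suspects x 3 incidents

U, s, Vt = np.linalg.svd(X, full_matrices=False)
X_rec = U @ np.diag(s) @ Vt        # rebuild X from the three factors

print(np.allclose(X, X_rec))       # True: the factorization is lossless
```

In practice, recommender systems keep only the largest singular values in Σ, trading this exactness for a low-rank model of the rating behavior.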

Aggregation strategies for a group of individuals [15]:

Maximizing average satisfaction: the products that, on average, best satisfy each member of the group are the ones recommended to the group.

Most pleasure: in the most pleasure approach, we take the maximum rating in the group as the group rating.
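The two aggregation strategies can disagree, as this small sketch with invented group ratings shows:

```python
# Contrast of the two group-aggregation strategies on illustrative
# ratings: average satisfaction vs. most pleasure (maximum rating).

group_ratings = {                  # hypothetical: item -> member ratings
    "i1": [4, 2, 5],               # divisive: one member loves it
    "i2": [4, 4, 4],               # uniformly liked
}

avg = {i: sum(r) / len(r) for i, r in group_ratings.items()}
most_pleasure = {i: max(r) for i, r in group_ratings.items()}

print(max(avg, key=avg.get))                       # i2: best on average
print(max(most_pleasure, key=most_pleasure.get))   # i1: highest single rating
```

Average satisfaction favors the consensus item, while most pleasure favors the item any one member rated highest.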

3.5.3. Content Based Method

Algorithm: Content-based similarity

Require: user i's profile information, item descriptions for items j ∈ {1, 2, …, n}, k keywords, r the number of recommendations.

1: return r recommended items.
2: Ui = (ui,1, ui,2, …, ui,k) = user i's profile vector;
3: {Ij}, j = 1, …, n, where Ij = (ij,1, ij,2, …, ij,k) is item j's description vector;
4: si,j = sim(Ui, Ij), 1 ≤ j ≤ n;
5: return the top r items with maximum similarity si,j.

Content-based recommendation systems are based on the fact that a user's interests should match the descriptions of the items recommended by the system [15]. In other words, the more similar an item's description is to the user's interests, the higher the likelihood that the user will find the recommendation interesting. Content-based recommender systems implement this idea by measuring the similarity between an item's description and the user's profile information: the higher the similarity value, the higher the chance that the item will be recommended.

To formalize a content-based method, we first represent both user profiles and item descriptions as vectors over a set of k keywords. After vectorization, item j can be represented as a k-dimensional vector Ij = (ij,1, ij,2, …, ij,k) and user i as Ui = (ui,1, ui,2, …, ui,k).

To compute the similarity between user i and item j (equation 3.4), we can use the cosine similarity between the two vectors Ui and Ij [17]:

Sim(Ui, Ij) = cos(Ui, Ij) = (Ui · Ij) / (‖Ui‖ ‖Ij‖) (3.4)
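Putting the algorithm and equation 3.4 together, a minimal sketch of content-based recommendation might look as follows; the keyword vectors are invented for illustration.

```python
# Content-based recommendation sketch: score each item's description
# vector against the user's profile vector with cosine similarity
# (equation 3.4) and return the indices of the top-r items.
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length keyword vectors."""
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))
    return num / den if den else 0.0

def recommend(profile, items, r):
    """Indices of the r items most similar to the user profile."""
    scores = [(cos_sim(profile, desc), j) for j, desc in enumerate(items)]
    return [j for _, j in sorted(scores, reverse=True)[:r]]

user = [1, 0, 1, 1]                 # hypothetical profile over k = 4 keywords
items = [[1, 0, 1, 0],              # item 0: partial keyword overlap
         [0, 1, 0, 0],              # item 1: no overlap
         [1, 0, 1, 1]]              # item 2: identical keyword interests
print(recommend(user, items, 2))    # [2, 0]
```

In the thesis setting, the "items" would be criminal incident descriptions and the profile would come from the victim's information.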

3.6. Vector Space Model

The Vector Space Model (VSM) is a standard technique in information retrieval in which documents are represented through the words that they contain. It was developed by Gerard Salton in the early 1960s to address some classic information retrieval problems. Vector space models convert texts into matrices and vectors and then employ matrix analysis techniques to find the relations and key features in the document collection. The process will be briefly explained with an example: consider a very small collection C with the following three documents [22, 23]:

D1: “new york times”

D2: “new york post”

D3: “los angeles times”

Some terms appear in two documents, and some appear only in one document. The total number of documents is N = 3. Therefore, the idf values (idf = log2(N/df), where df is the document frequency) for the terms are:

angeles: log2(3/1) = 1.584

los: log2(3/1) = 1.584

new: log2(3/2) = 0.584

post: log2(3/1) = 1.584

times: log2(3/2) = 0.584

york: log2(3/2) = 0.584

For all the documents, we calculate the tf scores for all the terms in C. We assume the words in the vectors are ordered alphabetically. The process is shown in table 3.1.

      angeles  los    new    post   times  york
D1    0        0      1      0      1      1
D2    0        0      1      1      0      1
D3    1        1      0      0      1      0

Table 3.1.Matrix presentation of query

Now we multiply the tf scores by the idf values of each term (table 3.2), obtaining the following matrix of documents-by-terms. (All terms appear only once in each document in our small collection, so the maximum frequency used for normalization is 1.)

      angeles  los    new    post   times  york
D1    0        0      0.584  0      0.584  0.584
D2    0        0      0.584  1.584  0      0.584
D3    1.584    1.584  0      0      0.584  0

Table 3.2.Matrix presentation using tf-idf values

Given the query "new new times", we calculate the tf-idf vector for the query and compute the score of each document in C relative to this query using the cosine similarity measure. When computing the tf-idf values for the query terms, we divide each term frequency by the maximum frequency (2) and multiply by the idf values:

      angeles  los    new    post   times  york
q     0        0      0.584  0      0.292  0

We calculate the length of each document and of the query:

Length of D1 = sqrt(0.584^2 + 0.584^2 + 0.584^2) = 1.011

Length of D2 = sqrt(0.584^2 + 1.584^2 + 0.584^2) = 1.786

Length of D3 = sqrt(1.584^2 + 1.584^2 + 0.584^2) = 2.316

Length of q = sqrt(0.584^2 + 0.292^2) = 0.652

Then the similarity values are:

cosSim(D1,q) = (0*0+0*0+0.584*0.584+0*0+0.584*0.292+0.584*0) / (1.011*0.652) = 0.776

cosSim(D2,q) = (0*0+0*0+0.584*0.584+1.584*0+0*0.292+0.584*0) / (1.786*0.652) = 0.292

cosSim(D3,q) = (1.584*0+1.584*0+0*0.584+0*0+0.584*0.292+0*0) / (2.316*0.652) = 0.112

According to the similarity values, the final order in which the documents are presented as result to the query will be: D1, D2, and D3.
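The worked example above can be reproduced programmatically; this sketch recomputes the idf values, tf-idf vectors, and cosine scores from the three raw documents and recovers the same ranking D1, D2, D3.

```python
# Reproduction of the vector space model example: idf = log2(N/df),
# tf normalized by maximum frequency, cosine scoring of the query
# "new new times" against the collection.
import math

docs = {"D1": "new york times", "D2": "new york post", "D3": "los angeles times"}
terms = sorted({t for d in docs.values() for t in d.split()})
N = len(docs)
df = {t: sum(t in d.split() for d in docs.values()) for t in terms}
idf = {t: math.log2(N / df[t]) for t in terms}

def tfidf(text):
    """tf-idf vector over the alphabetically ordered term list."""
    counts = {t: text.split().count(t) for t in terms}
    m = max(counts.values())            # normalize by maximum frequency
    return [counts[t] / m * idf[t] for t in terms]

def cos(u, v):
    num = sum(a * b for a, b in zip(u, v))
    den = (math.sqrt(sum(a * a for a in u))
           * math.sqrt(sum(b * b for b in v)))
    return num / den

q = tfidf("new new times")
scores = {d: round(cos(tfidf(text), q), 3) for d, text in docs.items()}
print(scores)   # D1 scores highest, then D2, then D3
```

The small differences from the hand calculation (third decimal place) come from rounding the idf values to 0.584 and 1.584 above.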

3.7. Basic Concepts of Information Theory

Information is inversely related to probability:

Information = F(1/Probability)

The function F must satisfy the following requirements [18]:

1. Its output must be a non-negative quantity.
2. Its minimum value is 0.
3. It should turn products into summations.

The logarithm satisfies all three, giving the information measure (equation 3.5):

I(x) = log_b(1/P(x)) (3.5)

Here b may be 2, e, or 10. If b = 2 the unit is bits, if b = e the unit is nats, and if b = 10 the unit is decits.
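Equation 3.5 and the product-to-summation property can be illustrated directly:

```python
# Self-information I(x) = log_b(1/P(x)) (equation 3.5): rarer events
# carry more information, and independent events add.
import math

def information(p, base=2):
    """Information content of an event with probability p (bits by default)."""
    return math.log(1.0 / p, base)

print(information(0.5))                    # 1.0 bit: a fair coin flip
print(round(information(0.25), 6))         # 2.0 bits: a rarer event
# Requirement 3, product into summation: I(p1 * p2) = I(p1) + I(p2)
print(round(information(0.5 * 0.25), 6))   # 3.0 bits = 1.0 + 2.0
```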

3.8. Betweenness Centrality

Another way of looking at centrality is to consider how important nodes are in connecting other nodes. One approach, for a node vi, is to compute the number of shortest paths between other nodes that pass through vi; consider equation 3.6:

Σ over s ≠ t ≠ vi of σst(vi)/σst (3.6)

Where σst is the number of shortest paths from node s to t (also known as information pathways), and σst(vi) is the number of shortest paths from s to t that pass through vi. In other words, we are measuring how central vi’s role is in connecting any pair of nodes s and t. This measure is called betweenness centrality.

Betweenness centrality is defined by (equation 3.7):

C_B(i) = Σ over j < k of g_jk(i)/g_jk (3.7)

Where g_jk is the number of shortest paths connecting j and k, and g_jk(i) is the number of those paths that actor i is on.

Usually normalized by (equation 3.8):

C'_B(i) = C_B(i) / [(n − 1)(n − 2)/2] (3.8)

where (n − 1)(n − 2)/2 is the number of pairs of vertices excluding the vertex itself.
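For intuition, betweenness can be computed by brute force on a tiny graph; the graph below is illustrative, and practical tools such as Gephi use far more efficient algorithms (e.g. Brandes').

```python
# Brute-force betweenness centrality (equations 3.7 and 3.8): for each
# pair (s, t), enumerate all shortest paths by breadth-first search and
# count the fraction passing through the node of interest.
from collections import deque
from itertools import combinations

graph = {    # illustrative path graph a-b-c-d: b and c bridge the ends
    "a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"],
}

def shortest_paths(s, t):
    """All shortest paths from s to t (BFS over simple paths)."""
    best, found = None, []
    q = deque([[s]])
    while q:
        path = q.popleft()
        if best is not None and len(path) > best:
            continue                       # longer than a known shortest path
        if path[-1] == t:
            best = len(path)
            found.append(path)
            continue
        for n in graph[path[-1]]:
            if n not in path:
                q.append(path + [n])
    return found

def betweenness(v, normalized=False):
    score = 0.0
    for s, t in combinations(graph, 2):
        if v in (s, t):
            continue                       # exclude the vertex itself
        paths = shortest_paths(s, t)
        score += sum(v in p for p in paths) / len(paths)   # g_jk(i)/g_jk
    if normalized:                         # equation 3.8
        n = len(graph)
        score /= (n - 1) * (n - 2) / 2
    return score

print(betweenness("b"))   # 2.0: b lies on the a-c and a-d shortest paths
print(betweenness("a"))   # 0.0: an endpoint lies between no other pair
```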

3.9. What is Classification?

Classification maps data into predefined groups or classes. It is often referred to as supervised learning because the classes are determined before examining the data. The steps of a classification method are described below [19].

- Aim: predict categorical class labels for new tuples/samples.
- Input: a training set of tuples/samples, each with a class label.
- Output: a model (a classifier) based on the training set and the class labels.

3.10. Decision Tree Induction

The decision tree approach is most useful in classification problems. With this technique a tree is constructed to model the classification process. Once the tree is built, it is applied to each tuple in the database, resulting in a classification for that tuple. There are two basic steps in the technique: building the tree and applying it to the database. An example of a decision tree is shown in figure 3.2.

[Figure not included in this excerpt]

Figure 3.2.Decision Tree Generation

A decision tree is a tree where,

- internal node = a test on an attribute
- tree branch = an outcome of the test
- leaf node = class label or class distribution
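A hand-built tree with this structure, using invented attributes for the crime-analysis setting, might look like:

```python
# A tiny hand-built decision tree matching the structure above:
# internal nodes test an attribute, branches are test outcomes, and
# leaves are class labels. Attributes and labels are hypothetical.

def classify(record):
    """Walk the tree: test attributes until a leaf (class label) is reached."""
    if record["incidents_nearby"] == "high":   # internal node: attribute test
        if record["time"] == "night":          # branch: outcome of the test
            return "high risk"                 # leaf node: class label
        return "medium risk"
    return "low risk"

print(classify({"incidents_nearby": "high", "time": "night"}))  # high risk
print(classify({"incidents_nearby": "low", "time": "day"}))     # low risk
```

Induction algorithms such as the one used by WEKA learn these attribute tests from the training tuples rather than hand-coding them.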

3.11. Herd Behavior

Herd behavior describes a situation in which a group of individuals performs highly correlated actions without any planning, where the network is observable and only public information is available [15].

Example: Stanley Milgram asked one person to stand still on a busy street corner in New York City and stare straight up at the sky; about 4% of all passersby stopped to look up. When 5 people stood on the sidewalk and looked straight up at the sky, 20% of all passersby stopped to look up. Finally, when a group of 18 people looked up simultaneously, almost 50% of all passersby stopped to look up.

3.12. Collective Behavior

Collective behavior occurs when a group of individuals behaves in a similar way. It can be planned and coordinated, but it is often spontaneous and unplanned [15]. The term was first defined by the sociologist Robert Park.

Examples:

- Individuals standing in line for a new product release
- Posting messages online to support a cause or to show support for an individual

We can analyze collective behavior by analyzing the individuals performing the behavior and then putting the results of these analyses together; the result is the expected behavior of a large population. This approach is popular for prediction purposes.

Chapter 4

Find Communities Depending on Criminal Incidents & Recommend Result to Users

In social network analysis, analyzing communities is essential. Studying communities in social media is important for many reasons. First, individuals often form groups based on their interests, and when studying individuals, we are interested in identifying these groups. Second, groups provide a clear global view of user interactions, whereas a local view of individual behavior is often noisy and ad hoc. Finally, some behaviors are only observable in a group setting and not at the individual level, because an individual's behavior can fluctuate while group collective behavior is more robust to change.

4.1. Objectives

In this method we treat different criminal incidents as nodes, and the victims' details related to those incidents define the node characteristics. Criminal activities of the same type then form communities. Crime incidents matching a user's profile are recommended to that user, who is advised to take preventive actions. Data is collected first, following a predefined schema for the victims' records.

4.2. Data Model

We look at our database table and try to find the relations among the data. The first three fields hold information about the user or victim: name, surname and email id. The next fields — gender, location, social status, age category and crime type — participate in creating the different classes. Gender has two values (male and female); location, social status, age and crime type differ from user to user. The model is shown in table 4.1 and its functionality is described below.

[Table not included in this excerpt]

Table 4.1.Data structure represents information given by victims

There are basically 3 different areas. Age is taken in ranges (15 to 30, 30 to 45, etc.), giving 3 different age types; there are also 3 different social statuses and 4 different crime types. A class is created depending on the attribute values selected for each user. Considering area A alone, there are 2 gender values, 3 age values, 3 social statuses and 4 types of crime, so the different classes for area A are created from these combinations; for areas B and C correspondingly many classes are created. Counting those classes finally yields the final dataset. In our problem, however, the classes are not required, and they are left for future use.
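The class count for a single area follows directly from the attribute cardinalities; a quick enumeration, with illustrative value names:

```python
from itertools import product

genders = ["m", "f"]
ages = ["a1", "a2", "a3"]                 # e.g. 15-30, 30-45, 45 and above
statuses = ["s1", "s2", "s3"]
crime_types = ["c1", "c2", "c3", "c4"]

# Every combination of attribute values defines one class for a single area.
classes_per_area = list(product(genders, ages, statuses, crime_types))
print(len(classes_per_area))       # 72 classes for one area
print(3 * len(classes_per_area))   # 216 classes across areas A, B and C
```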

4.3. Proposed Model 1

Our work can be explained graphically through the flow chart shown in figure 4.1. Initially we collect data on victims related to different types of criminal activities. Next we create groups of criminal incidents (based on member-based community detection using node similarity, where each node is an incident). Then we inform the users and advise them to take preventive actions (the content-based method is used).

[Figure not included in this excerpt]

Figure 4.1.Model 1 Representation of the Process

Example:

Initially, the types of criminal activities that occur most frequently in India are selected. More than 200 different types of criminal activities have been detected in India, and if the IPC charges are considered, the number is huge. Here, 56 different types of criminal activities are taken. Every criminal activity is then described by the victims' characteristics: age, social status, gender and location. After collecting data from the victims, each criminal activity is treated as a node, and the victims' behavior related to that node describes the node's characteristics [20].

Now the similarities between nodes are calculated. Using network information, the similarity between two nodes can be computed by measuring their structural equivalence or their regular equivalence; here structural similarity is used. Next, some groups are detected with the help of a community detection algorithm. Among the different types of community detection algorithms, member-based community detection is used here. Grouping can also be done using clustering methods, but the major difference between clustering and community detection is that in community detection, individuals are connected to others via a network of links, whereas in clustering, data points are not embedded in a network. In this problem, before calculating node similarity, the vector space model is used to represent the nodes.

In content-based recommendation, we compute the items most similar to a user j and then recommend these items in order of similarity [15]. The process is described with an example (table 4.2) below:

[Table not included in this excerpt]

Table 4.2.Group representation of criminal incidents

For this group, A1 occurred 3 times in 3 rows, so A1's occurrence rate is 100%, whereas A2 occurred 2 times out of 3 rows, so A2's occurrence rate is 66%.

In the next column, the occurrence rate of LMC is 1 out of 3 rows, i.e. 33%. If the threshold value is 35%, we discard LMC and keep the remaining items. The threshold value is not constant and must be selected depending on the situation, but whichever value is selected remains the same for all groups.
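The occurrence-rate filtering can be sketched as follows; the three rows and the 35% threshold mirror the example, and the item names are illustrative:

```python
def filter_items(rows, threshold):
    # Occurrence rate of an item = rows containing it / total rows;
    # items below the threshold are discarded from the group profile.
    counts = {}
    for row in rows:
        for item in set(row):
            counts[item] = counts.get(item, 0) + 1
    n = len(rows)
    return {item for item, c in counts.items() if c / n >= threshold}

# Three victim rows of one group, mirroring the example above:
rows = [["A1", "A2", "LMC"], ["A1", "A2"], ["A1"]]
print(sorted(filter_items(rows, 0.35)))  # ['A1', 'A2']: LMC (33%) discarded
```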

After applying this process we finally get the groups. The groups are represented using the vector space model, and a user profile is taken and also represented using the vector space model. Then cosine similarity is applied; the process is given below (tables 4.3 to 4.8):

[Table not included in this excerpt]

Table 4.3.Vectorization process (1)

[Table not included in this excerpt]

Table 4.4.Frequency-idf values

[Table not included in this excerpt]

Table 4.5.Vectorization process (2)

Now we randomly select a profile and apply the similarity calculation.


Table 4.6.User Details


Table 4.7.User tf-idf details

Finally, we get the details of the similarities between the user profile and the groups, given in table 4.8 below:


Table 4.8.Similarity Calculation

This information is required for the recommendation process.
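The vectorize-and-match pipeline can be sketched end to end in Python; the group contents, user profile and tf-idf weighting details below are illustrative assumptions, not the thesis dataset:

```python
import math

def tf_idf_vectors(docs):
    # docs: list of term lists -> one tf-idf vector per document,
    # with idf computed over the whole pool (groups plus user profile).
    vocab = sorted({t for d in docs for t in d})
    n = len(docs)
    df = {t: sum(1 for d in docs if t in d) for t in vocab}
    return vocab, [[d.count(t) * math.log(n / df[t]) for t in vocab]
                   for d in docs]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Illustrative group profiles (items surviving the occurrence-rate
# threshold) and one user profile, over the same attribute vocabulary:
groups = [["f", "a2", "mc", "theft"],
          ["m", "a1", "lmc", "cyber"],
          ["f", "a2", "mc", "cyber"]]
user = ["f", "a2", "mc"]

vocab, vecs = tf_idf_vectors(groups + [user])
uvec = vecs.pop()                       # last vector = user profile
ranked = sorted(range(len(groups)),
                key=lambda i: cosine(uvec, vecs[i]), reverse=True)
print(ranked)                           # group indices, most similar first
```

The groups at the top of the ranking are the ones recommended to the user.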

4.4. Result Analysis 1

As a result we get different classes. In our example, 18 different classes are generated: we fix an area and a particular crime type and vary the other attributes (gender, social status and age), which gives 2 × 3 × 3 = 18 classes. Considering a particular crime type across the three areas gives 3 × 18 = 54 classes, and applying all four crime types gives 4 × 54 = 216 classes. Initially, however, we consider only the 18 classes and analyze the result.

We can generate some decision rules before working in WEKA; the rules follow the structure shown below.

IF SS=S1 && AL=A1 && G=M THEN D=CL1

IF SS=S1 && AL=A1 && G=F THEN D=CL2

….

IF SS=S3 && AL=A3 && G=F THEN D=CL18

Where SS=social status, AL=age limit, G=gender and D=decision.

Now we have 75 different instances distributed over these classes. We calculate the frequency of each class and generate the occurrence rates, which can be clearly shown through a column chart (figure 4.2).

[Figure not included in this excerpt]

Figure 4.2.Column chart representation of crime record

The decision tree (figure 4.3) is generated for a particular area and a particular crime type. It branches first on social status, then on age limit, then on gender, and finally outputs the class as the result.

[Figure not included in this excerpt]

Figure 4.3.Decision tree generation in WEKA

Now we analyze the result and see how it can be useful for society. The figure shows that class 10 is the most affected: persons with social status 'mc', age limit 'a2' and gender 'female' are highly affected by a particular type of crime in a particular area. This information is expected to be helpful for society. Moreover, we can individually calculate which groups are more affected than others.

[Figure not included in this excerpt]

Figure 4.4.Age, Gender and Social Status representation

From figure 4.4 we can derive that the age limits a1 and a2 are more affected than a3, each with an occurrence rate of 36%. The figure also shows that females are more affected than males, with an occurrence rate of 54.66% for f, and that social status mc is the most affected, with an occurrence rate of 40%.

Now the similarities between nodes are calculated. Using network information, the similarity between two nodes can be computed by measuring their structural equivalence or their regular equivalence; here structural similarity is used. In our problem each node represents a different type of criminal activity, and its characteristics are described by the victims' information. Initially a number of groups has to be selected: for the 56 different criminal activities we select 12 groups, which are treated as 12 nodes. During selection we calculate the total number of unique nodes present in the document and then randomly select 12 nodes from those unique rows [21]. The process was implemented in Java.

Next, the 12 unique nodes are represented as matrices and vectors using the vector space model, and the member-based community detection algorithm is applied. The intuition behind member-based community detection is that members with the same (or similar) characteristics are more often in the same community. Three general node characteristics that are frequently used are node similarity, node degree (familiarity), and node reachability. Here node similarity is used to solve the problem. Then a new node is selected and converted into the vector space model.

When inserting the values, some attribute values like NA and Any indicate that the values carry no usable information. Take the example of 3 areas a1, a2, a3 whose occurrence rates are 33.3% each. Each has the same probability: a1 = 1/3, a2 = 1/3 and a3 = 1/3.

Then P(a1) + P(a2) + P(a3) = 1/3 + 1/3 + 1/3 = 1

Information(a1a2a3) ∝ log(1 / P(a1a2a3))

Information(a1a2a3) ∝ log(1/1) = 0

That is why we put the value 0 for the attribute value Any. The attribute value NA (not available) is also represented by 0, because if we do not have any information, it can be counted as 0.
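The zero-information argument can be written out numerically; a sketch using base-2 logarithms (the thesis does not fix a log base):

```python
import math

def information(prob):
    # Information content proportional to log(1/p); base 2 gives bits.
    return math.log2(1.0 / prob)

p_one_area = 1.0 / 3      # a1, a2, a3 each occur 33.3% of the time
p_any = 3 * p_one_area    # "Any" covers a1 + a2 + a3, so probability 1

print(round(information(p_one_area), 3))  # 1.585 bits for a single area
print(information(p_any))                 # 0.0 -> encode Any (and NA) as 0
```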

Now the relation between that new node and the other 12 nodes, which are treated as different groups, is calculated: cosine similarity gives the similarity of each group to that particular node. In this process the similarity between the 12 groups and the other 20 nodes has been calculated. We then represent the nodes in a graph whose edges carry the similarity between two nodes as weights.

In figure 4.5, thicker edges between two nodes carry higher weight than the other edges. The edge weights are calculated from the similarity, so thin edges represent lower similarity whereas thick edges represent higher similarity.

[Figure not included in this excerpt]

Figure 4.5.Nodes representation of criminal incidents

Next, figure 4.6 gives the details of the edges:

[Figure not included in this excerpt]

Figure 4.6.Edge details of criminal incidents

We initially selected 12 groups. After applying the vector space model and the similarity calculation, we obtain the following communities:

{d2,d4}, {d5,d6}, {d9,d3,d8,d20}, {d14,d27}, {d22,d21,d23}, {d26,d25}, {d31,d29,d30}, {d40,d18,d51}, {d45,d43,d44,d46}, {d50,d1}, {d54,d53,d55}, {d17,d16}

As a result we get the 12 groups or communities and their members. Each group represents the same type of criminal activities in terms of the victims' records: within a community the victims are more or less the same in terms of age, gender, social status, location, etc.

We already have the different groups. Now we recommend the top three groups to a person by matching the profile information with the group behavior. The process works as an awareness service: it informs the user which criminal activities relate to their profile information and by which activities he/she could be affected. It is not reasonable to say that a person will certainly be affected by a single criminal activity, so we provide the user with a few groups of criminal activities to be careful about, recommending the top three most similar groups.

We obtain the similarities between the user details and the group details from table 4.8. Taking the top 3 similarities from that table shows that the activities of groups G1, G9 and G12 are most similar to the user profile. Users can then be informed about these activities through a mobile application or social networking sites and advised to take preventive steps [22].

4.5. Summary

The similarities between nodes were calculated. Using network information, the similarity between two nodes can be computed by measuring their structural equivalence or their regular equivalence; here structural similarity was used. Groups were detected with the help of a community detection algorithm, and the top three groups were recommended to a person by matching the profile information with the group behavior.

Chapter 5

Detect Criminal Network & Group Activity Using Collaborative Filtering

Collaborative filtering is another set of classical recommendation techniques. In collaborative filtering, one is commonly given a user-item matrix where each entry is either unknown or the rating assigned by the user to that item. The aim is to predict the missing ratings and possibly recommend the item with the highest predicted rating to the user. This prediction can be performed directly using the previous ratings in the matrix; this approach is called memory-based collaborative filtering because it employs historical data available in the matrix.

5.1. Objectives

In this method we again treat different criminal incidents as nodes, with the victims' details related to those incidents defining the node characteristics, so criminal activities of the same type form communities. These crime incidents are related to different suspects; using this data, with the help of collaborative filtering, we try to find criminal networks and gang activities among the suspects.

5.2. Proposed Model 2

We can explain our work through a flow chart which is shown in figure 5.1.

[Figure not included in this excerpt]

Figure 5.1.Model 2 Representation of the Process

The processes of data collection and community detection have already been described in chapter 4, and the same processes are followed in this chapter. After detecting the communities, collaborative filtering, betweenness centrality and the SVD technique are used to find criminal networks and detect gang activities among suspects. The process can be explained using examples.

Example:

To find the criminal networks and the important links in those networks, we use the concepts of collaborative filtering and betweenness centrality. The example below explains the process [23].

[Table not included in this excerpt]

Table 5.1.Suspect’s ratings

In table 5.1, xyz1 to xyz5 are different suspects related to the different criminal behaviors of group 1. A rating system out of 5 rates them on the intensity of their criminal activities.

A value of 0 means the person is not related to that criminal activity at all, and the police department is certain about it. In one position, however, a '?' mark is present, meaning the person is somehow related to that crime but cannot yet be rated. The missing rating can be found using cosine similarity with the other users; the process is shown below [15].

The rating between xyz4 and d44 is missing.

Average ratings:

rxyz1 = (3 + 0 + 3 + 3)/4 = 2.25

rxyz2 = (5 + 4 + 0 + 2)/4 = 2.75

rxyz3 = (1 + 2 + 4 + 2)/4 = 2.25

rxyz4 = (3 + 1 + 0)/3 = 1.33

rxyz5 = (2 + 2 + 0 + 1)/4 = 1.25

Similarity between xyz4 and the others:

sim(xyz4, xyz1) = (3·3 + 1·3 + 0·3)/(√10·√27) = 0.73

sim(xyz4, xyz2) = (3·5 + 1·0 + 0·2)/(√10·√29) = 0.88

sim(xyz4, xyz3) = (3·1 + 1·4 + 0·2)/(√10·√21) = 0.48

sim(xyz4, xyz5) = (3·2 + 1·0 + 0·1)/(√10·√5) = 0.84

Taking a neighborhood of size 2, xyz4 is most similar to xyz2 and xyz5, and the predicted rating for xyz4 is:

r(xyz4, d44) = rxyz4 + [ Σu∈N sim(xyz4, u)·(r(u, d44) − ru) ] / Σu∈N sim(xyz4, u)

r(xyz4, d44) = 1.33 + [0.88·(4 − 2.75) + 0.84·(2 − 1.25)] / (0.88 + 0.84) = 2.33
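The worked example can be reproduced numerically. The full ratings matrix below is reconstructed from the sums and products shown above; the column order and the assignment of the last column to d44 are assumptions:

```python
import math

# Rows: xyz1..xyz5; columns: three known crimes, then d44 (None = missing).
R = [[3, 3, 3, 0],
     [5, 0, 2, 4],
     [1, 4, 2, 2],
     [3, 1, 0, None],
     [2, 0, 1, 2]]

def mean(row):
    vals = [r for r in row if r is not None]
    return sum(vals) / len(vals)

def cos(u, v):
    # Cosine similarity over the columns rated by both suspects.
    pairs = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    dot = sum(a * b for a, b in pairs)
    nu = math.sqrt(sum(a * a for a, _ in pairs))
    nv = math.sqrt(sum(b * b for _, b in pairs))
    return dot / (nu * nv)

target, item = 3, 3                       # xyz4's missing rating on d44
sims = [(cos(R[target], R[u]), u) for u in range(len(R)) if u != target]
top2 = sorted(sims, reverse=True)[:2]     # neighborhood of size 2

r_bar = mean(R[target])                   # 1.33
pred = r_bar + sum(s * (R[u][item] - mean(R[u])) for s, u in top2) \
             / sum(s for s, _ in top2)
print(round(pred, 2))
```

With unrounded similarities the prediction comes out as 2.34; the 2.33 in the text results from rounding the similarities to two decimals before substituting.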

Now that we have the missing rating, it replaces the question mark, and the complete table is given below (table 5.2):

For Group1

[Table not included in this excerpt]

Table 5.2.Suspect’s rating for group1

We now take two more groups (table 5.3 and table 5.4), which will be required for analyzing the results later.

For Group2

[Table not included in this excerpt]

Table 5.3.Suspect’s rating for group2

For Group3

[Table not included in this excerpt]

Table 5.4.Suspect’s rating for group3

Next we find the common behaviors among the criminals towards the different crime activities. Let there be five different crime activities in the same group, D1, D2, D3, D4 and D5, and details of 20 suspects who are related to these criminal activities in some way. Using table 5.5 we can rate them as given below:

[Table not included in this excerpt]

Table 5.5.Suspect’s rating for behavior analysis

Now we use the concepts of model-based collaborative filtering and apply the singular value decomposition technique to decompose the given matrix. After decomposition we get 3 different matrices (table 5.6, table 5.7 and table 5.8). Considering a rank-2 approximation (i.e. k = 2), we truncate all three matrices. The process can be implemented in Matlab using the svd command.
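An equivalent sketch in Python/NumPy (the 5 × 5 rating matrix is illustrative; the thesis example uses 20 suspects):

```python
import numpy as np

# Illustrative suspects-by-crimes rating matrix (rows: suspects, cols: D1..D5).
A = np.array([[5, 4, 0, 1, 1],
              [4, 5, 1, 0, 1],
              [0, 1, 5, 4, 4],
              [1, 0, 4, 5, 5],
              [2, 2, 2, 2, 2]], dtype=float)

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                   # rank-2 approximation
Uk, sk, Vtk = U[:, :k], s[:k], Vt[:k, :]

# 2-D coordinates for visual inspection: rows of Uk (scaled by sk) place
# suspects, columns of Vtk place crime activities in the same plane.
suspect_coords = Uk * sk                # shape (5, 2)
crime_coords = (np.diag(sk) @ Vtk).T    # shape (5, 2)

A2 = Uk @ np.diag(sk) @ Vtk             # best rank-2 reconstruction of A
print(np.round(A2, 1))
```

By the Eckart-Young theorem, A2 is the closest rank-2 matrix to A in the Frobenius norm, which is why the 2-D plot preserves the dominant similarity structure.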

U=

[Table not included in this excerpt]

Table5.6.Decomposed value of Suspect’s rating

[Table not included in this excerpt]

Z=

Table 5.7.Diagonal Matrix

[Table not included in this excerpt]

VT =

Table 5.8.Decomposed value for crime incidents ratings

The rows of U represent the suspects, and the columns of VT (or rows of V) represent the crime activities. Thus we can plot suspects and crime incidents in a 2-D figure. By plotting the suspects' rows or the criminal incidents' columns, we avoid computing distances between them and can visually inspect which are most similar to one another.

5.3. Result Analysis 2

To find the criminal networks and the important links in those networks, we use the concepts of collaborative filtering and betweenness centrality. The details of suspects can be collected from the local police station or from the websites of the police department. A suspect can be related to different types of criminal activities, and based on their activities we can rate them as wanted, most wanted, and so on. These are discrete values; we convert them into continuous values and use those to rate the suspects according to the different criminal activities.

Each group now describes the details of the suspects responsible for different types of criminal activities where the victims' characteristics are the same in terms of location, age, social status and gender. From this we can predict that the suspects may belong to the same gang or may have some relation to each other, so gang activity can be detected from these groups.

Looking at the tables, we find some suspects who belong to different communities; but if we calculate their total ratings for each group, we find that they are not among the highly rated criminals.

We now represent the suspects of the different communities through an undirected graph, connecting the suspects according to their groups. Figure 5.2 describes the concept.

[Figure not included in this excerpt]

Figure 5.2.Suspect’s network

From figure 5.2 we calculate the betweenness centrality of each node; the values are shown in figure 5.3:

[Figure not included in this excerpt]

Figure 5.3.Betweenness centrality representation

From the graph it can be seen that the betweenness centrality of xyz5 is the highest, while that of xyz4 is the second highest.

From this information it can be deduced that xyz5 is the most valuable person in terms of information, even though his participation in the criminal activities is lower than that of the other suspects. The graph clearly shows that he is linked with different communities, so he communicates between different gangs and acts as a link person. If he can be tracked down, the police department will be able to retrieve information about the other suspects. In this example xyz4 also has a higher betweenness centrality than the remaining suspects, so he is also valuable in terms of information [18].

With this model of community detection we can group the same types of criminal activities and then rate the suspects by assigning them to the groups, from which gang activity can be inferred [24]. Among those suspects, betweenness centrality identifies some who are valuable in terms of information; if they can be tracked and taken into custody, interrogation can reveal the real gang activity or details about the other suspects [25].

Previously we saw from the suspects' crime ratings that they appeared to form one group, and we tried to detect important links among them with the help of betweenness centrality. Now, by analyzing each suspect's behavior towards the different criminal activities, we try to predict gang activity more accurately and find the persons who are important in each gang. For this we use a matrix decomposition technique, singular value decomposition.

Some criminal activities are basically group-dependent or gang-dependent. This technique helps find the gang members, which provides leads to the investigating authority. Crime data basically depends on evidence and interrogations, so if the investigating authority gets some clues about gang members from this method, it will help them interrogate the suspects.

We already have different groups according to the victims' details in the same location. We assume 5 different crimes D1, D2, D3, D4 and D5 and define a few suspects according to their ratings. The matrix is then decomposed using the SVD technique.

As a result we get 3 different tables. Considering a rank-2 approximation (k = 2), we truncate the matrices, obtaining U, S and Vt, where the rows of U represent the suspects and the columns of Vt (or rows of V) represent the crime activities. Thus we can plot suspects and crime activities in a 2-D figure. By plotting the suspects' rows and the crime-activity columns, we avoid computing distances between them and can visually inspect which crime activities or suspects are most similar to each other. The initial plot is shown in figure 5.4 below:

[Figure not included in this excerpt]

Figure 5.4.2-D plot representation of suspects & criminal incidents

In this figure a few points lie close to each other. The star-shaped points represent criminal incidents. The graph makes clear that D4 and D5 are similar to each other, so the suspects related to these crime activities may be related to each other: if some suspects are involved in crime D4, there is some probability that they are also related to D5. Because the two incidents show the same behavior, solving one criminal incident may provide the solution to the other. We now produce another figure, removing the crime activity details from the graph, as shown in figure 5.5 below:

[Figure not included in this excerpt]

Figure 5.5.2-D plot representation of Suspects

From this figure we find that the following suspects are more or less the same in their criminal behavior. A few gang activities can be observed: S1: {xyz7, xyz8, xyz3, xyz4, xyz17, xyz18, xyz11, xyz12, xyz13, xyz14}, S2: {xyz1, xyz6, xyz15}, S3: {xyz5, xyz9}, S4: {xyz19, xyz10}, and the singletons {xyz2}, {xyz3}, {xyz16}. Four different gangs are clearly observed, with S1 having the most members. S2, S3 and S4 have fewer members, but S3 and S4 are somehow connected with gang S1 through links, because xyz17 and xyz7 sit between S1 and S3 and between S1 and S4 respectively. The graph also shows that three suspects, xyz2, xyz3 and xyz16, are isolated and do not belong to any gang. Representing the suspects in a graph (figure 5.6) and calculating betweenness centrality, we find that the betweenness centralities of xyz7 and xyz17 are much higher than the others'. They may be related to two different gangs, working as gatekeepers or communicators between them.

[Figure not included in this excerpt]

Figure 5.6.Suspect’s connectivity

We can assume that the crime incident that satisfies each member of the group on average is the preferred criminal activity of that group. To find the most preferable activity of each group, we take the maximum rating among the group members' ratings as the group rating. The calculations for groups S1, S2, S3 and S4 are explained in figure 5.7 and figure 5.8.

[Figure not included in this excerpt]

Figure 5.7.Preferable crime activity by a group

[Figure not included in this excerpt]

Figure 5.8.Most preferable crime activity by a group

List of Deductions:

1. There are 4 different groups related to 5 different criminal activities, which are mostly gang activities.
2. Criminal incidents D4 and D5 are similar to each other in the same area, so solving one case will provide leads for the other.
3. Group S1 is powerful, and its strength is higher than that of any other group.
4. Two persons work as gatekeepers or link persons between two different groups.
5. 3 suspects differ in criminal nature, and it is assumed that they do not belong to any group.
6. xyz3 may work alone, but there is some chance that he is attached to group S2.
7. The members of group S1 are interested in crime D1, but for a few members D1 is the most preferable crime due to some reasons. The same can be calculated for the other groups.
8. In a narcotics-selling group, suspects prefer to sell their products in places where demand is high and which are secure. They prefer the same place based on environmental conditions, which is called repeat or near-repeat action. If one of the gang members can be found by identifying the place, the police get an idea about the gang members and can directly ask the suspect about the others.

5.4. Summary

The similarities between nodes were calculated; structural similarity was used, and groups were detected with the help of a community detection algorithm. To find the criminal networks and the important links in those networks, the concepts of collaborative filtering and betweenness centrality were used to identify the key suspects. To find the common behaviors of the criminals towards the different crime activities, the SVD technique was applied successfully.

Chapter 6

A Statistical Model to Determine the Behavior Adoption in Victims in Different Timestamps

Herd behavior occurs when a group of individuals performs highly correlated actions without any planning, where the network is observable and only public information is available. Collective behavior occurs when a group of individuals behaves in a similar way; it can be planned and coordinated, but is often spontaneous and unplanned.

Example: Assume a person is on a trip in a metropolitan area he is less familiar with [15]. Planning for dinner, he finds restaurant A with excellent reviews online and decides to go there. On arriving at A, he sees that A is almost empty while restaurant B, which is next door and serves the same cuisine, is almost full. He decides to go to B, based on the belief that the other diners have also had the chance of going to A; this is an example of herd behavior. For some reason restaurant B is almost full and most persons are selecting it, even though its reviews are not as good as restaurant A's.

6.1. Objectives

If a person goes to a restaurant at some later time (consider the previous example), he will probably find more people in restaurant B than in A. Before that person arrives, a set of people have already selected restaurant B, because something about it was more attractive than restaurant A. So we can say that at time t1 a few people were attracted to restaurant B for some reason. At time t2, when other people go to restaurant B, those attractions may or may not still be present, but the presence of the group from time t1 also attracts the persons of time t2. This may continue over further timestamps t3, t4 and so on, although the behavior was initiated at t1. We use this concept to analyze the comments on social networks.

It is observed that when people write comments on Facebook posts, most of the time they do not read the complete content. They read a few lines, get some idea about the post, look at the previous comments given by others and by their friends, and then write their own comments. Here we firstly determine the minimum requirement by which we can establish that the persons of later timestamps are following/adopting the behavior of timestamp t1, and secondly, we predict the behavior of upcoming timestamps from the initial timestamp t1.

6.2. Survey on Facebook to Collect Data

To perform this analysis we need to collect data from a social networking site, and for that purpose we selected Facebook. On Facebook we published a post seeking users' opinions on a topic: supposing they were victims of criminal activities like pickpocketing, cyber crime, theft, etc. A few options were available, chosen from words usually used by people after being affected by criminal activities. We thus have seven of the most frequent words often used by victims, and a person could select any number of words. A snapshot of that page is shown below (figure 6.1):

[Figure not included in this excerpt]

Figure 6.1.Snapshot of survey page

We ran the survey from 7th April, 2015 to 23rd April, 2015 and collected the opinions of 160 persons. We divided the 17-day duration into 4 timestamps t1, t2, t3, t4, where t1 (4 days) received 40 opinions (U1 to U40), t2 (6 days) received 45 opinions (U41 to U85), t3 (4 days) received 40 opinions (U86 to U125), and t4 (3 days) received 35 opinions (U126 to U160).

6.3. Proposed Procedure

Initially we have different timestamps, and groups are created according to those timestamps. A group behavior is calculated at timestamp t1, and then again at timestamp t2. If at least one behavior is the same in those two timestamps, herd behavior or adoption is found; if there is no common behavior, there is no herd behavior. When we look for adoption of behavior at timestamp t3, it must match at least one behavior in the combined group behavior of timestamps t1+t2. Likewise, for timestamp t4 it must match at least one behavior in the combined group behavior of t1+t2+t3, and so on. How this is done is explained below.

1. At timestamp t1, S1 is the set of documents S1 = {d11, d12, …, d1n}. Each document contains a few terms.
2. Use the modified vector space model to find the strength of each document by calculating, for each term, the ratio of the number of documents in S1 containing that term to the total number of documents in S1.
3. Identify the document with the highest strength (dsh1).
4. At timestamp t2, S2 is another set of documents, S2 = {d21, d22, …, d2n}.
5. Use the modified vector space model to find the strength of each document from the ratio of the number of documents in S2 containing each term to the total number of documents in S2, and identify the document with the highest strength (dsh2).
6. Apply cosine similarity between dsh2 and all documents of S1.
7. If dsh2 is most similar to dsh1, then an adoption of behavior has occurred.
8. If dsh2 is most similar to some other document danother1 that is a subset of dsh1, so that at least one behavior is common between dsh1 and danother1, then adoption has also been found.
9. Then find the adopted behaviors for t1+t2 as ns1 = {dsh1 ∩ dsh2} or {dsh1 ∩ danother1}.
10. At timestamp t3, the modified vector space model is again used to find the strength of each document from the ratio of the number of documents in S3 containing each term to the total number of documents in S3. Identify the document with the highest strength (dsh3), and apply cosine similarity between dsh3 and all documents of S1.
11. If dsh3 is most similar to dsh1, then herd behavior has occurred; or if dsh3 is most similar to some other document danother2 that has at least one behavior in common with ns1, then adoption has been found.
12. Then find the herd behavior for t1+t2+t3 as ns2 = {dsh1 ∩ ns1} or {ns1 ∩ danother2}; this result is used for the next timestamp. The process can be repeated up to timestamp tn to find adoption.
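The steps above can be sketched as a minimal Python implementation. The helper names (`term_weights`, `strength`, `strongest`, `cosine`) are our own, and treating each document as the set of terms its user selected is an assumption about the data layout, not something the thesis specifies:

```python
from collections import Counter

def term_weights(docs):
    """Weight each term by its document frequency divided by the
    total number of documents (the thesis's replacement for IDF)."""
    n = len(docs)
    df = Counter(t for d in docs for t in set(d))
    return {t: c / n for t, c in df.items()}

def strength(doc, weights):
    """Strength of a document: the sum of the weights of its terms
    (one possible reading of 'strength' in step 2)."""
    return sum(weights.get(t, 0.0) for t in set(doc))

def strongest(docs):
    """Return the document with the highest strength (a dsh)."""
    w = term_weights(docs)
    return max(docs, key=lambda d: strength(d, w))

def cosine(a, b):
    """Cosine similarity between two documents treated as binary
    term sets."""
    a, b = set(a), set(b)
    if not a or not b:
        return 0.0
    return len(a & b) / (len(a) ** 0.5 * len(b) ** 0.5)
```

Under this reading, dsh1 is `strongest(S1)`, and step 6 reduces to comparing `cosine(dsh2, d)` for every document d in S1.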

This procedure captures the information diffusion from timestamp t1 to the later timestamps, and shows how the persons of the later timestamps follow the information of the earlier groups.

6.4. Result Analysis

Two different questions can arise here:

1. Can it be done by using the term frequency of the overall documents?

We cannot define herd behavior by using term frequency or percentage alone. Consider the example below:

[Table not included in this excerpt]

Table 6.1.Example of term occurrences

In this table, 19 users are divided into three timestamps t1, t2, and t3. To follow herd behavior, the information of timestamp t1 must diffuse into t2 and t3. The overall frequencies of B1, B2, and B3 are 6, 5, and 6 respectively, but in t1, B3 occurs 0 times. So although the overall occurrence (frequency or percentage) of B3 equals that of B1 and exceeds that of B2, B3 does not follow the herd or group behavior of t1. If we removed t1 and the later timestamps followed t2 instead, then B3 would be counted.
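Since Table 6.1 is not included in this excerpt, the following sketch uses hypothetical per-timestamp selections that reproduce the stated totals (B1 = 6, B2 = 5, B3 = 6, with B3 absent from t1) to show why overall frequency alone is misleading:

```python
from collections import Counter

# Hypothetical selections per timestamp, chosen only to match the
# totals quoted in the text; the real table is not in the excerpt.
timestamps = {
    "t1": ["B1", "B1", "B2", "B2"],
    "t2": ["B1", "B2", "B3", "B3", "B3"],
    "t3": ["B1", "B1", "B1", "B2", "B2", "B3", "B3", "B3"],
}

overall = Counter(t for sel in timestamps.values() for t in sel)
# B1 and B3 tie overall (6 each), yet B3 never appears in t1, so it
# cannot be part of the group behavior that diffuses from t1.
print(overall)
print(Counter(timestamps["t1"])["B3"])
```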

2. Why select term frequency rather than IDF values?

IDF weighting penalizes popular terms. The standard vector space model uses IDF values, which are not effective for finding the strength of the documents here. So rather than using IDF values, we use the ratio between a term's document occurrence and the total number of documents. This is explained in figure 6.2.

[Figure not included in this excerpt]

Figure 6.2.IDF weighting vs. Doc Frequency

IDF(W) = log[(M + 1) / K]

M = total number of documents in the collection

K = number of documents containing W (document frequency)
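A small sketch contrasting the two weightings. The function names `idf` and `df_ratio` are our own, and M = 40 is chosen only to echo the 40 opinions of timestamp t1:

```python
import math

def idf(M, k):
    """Standard IDF, log((M + 1) / k): shrinks as more documents
    contain the term, i.e. it penalizes popular terms."""
    return math.log((M + 1) / k)

def df_ratio(M, k):
    """The thesis's alternative weight, document frequency over the
    collection size: it grows with popularity instead."""
    return k / M

M = 40
for k in (1, 10, 20, 40):
    print(k, round(idf(M, k), 3), df_ratio(M, k))
```

As k grows, `idf` falls toward zero while `df_ratio` rises toward one, which is why the thesis prefers the ratio for measuring how representative a popular behavior is.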

At the end of time t1, U30 has the highest strength, so his behavior can be taken as the group behavior at t1. At the end of time t2, U76 has the highest strength, so his behavior can be taken as the group behavior at t2. At the end of time t3, U122 has the highest strength, so his behavior can be taken as the group behavior at t3. At the end of time t4, U154 has the highest strength, so his behavior can be taken as the group behavior at t4.

These four persons have the following characteristics:

U30 = {1, 2, 3, 5, 6} at time t1

U76= {1, 3, 4, 5, 6, 7} at time t2

U122= {1, 2, 3, 4, 5, 6} at time t3

U154= {1, 3, 4, 5, 6, 7} at time t4

Now calculate the similarity of U76 with the other members of timestamp t1. It is found that U76 of t2 is most similar to U10 of t1, where U10 = {1, 3, 5, 6}. U10 is a subset of U30, and U10 ∩ U30 gives ns1 = {1, 3, 5, 6}. This behavior diffused from t1 to t2.

Next, calculate the similarity of U122 with the other members of timestamp t1. It is found that U122 of t3 is most similar to U1 of t1, where U1 = {1, 2, 3, 4, 5}. U1 is not a subset of U30 (the highest-strength member), so we compute ns1 ∩ U1, giving ns2 = {1, 3, 5}. This behavior diffused into timestamp t3 from t1, with t2 as another layer. So after timestamp t3 we can say that the set of collective behaviors {1, 3, 5} has diffused from timestamp t1 to the other timestamps, and we can predict that in the next timestamps these behaviors will remain common to the groups.

To satisfy the minimum requirement for herd behavior, at least one behavior counted as group behavior must be common with the initial group (timestamp t1). A group behavior is calculated at timestamp t1 and again at timestamp t2; if at least one behavior remains the same in those two timestamps, herd behavior is found, but if there is no common behavior, there is no herd behavior. When we look for herd behavior at timestamp t3, it must match at least one behavior in the combined group behavior of t1+t2, which is ns1. Likewise, for timestamp t4 it must match at least one behavior in the combined group behavior of t1+t2+t3, which is ns2, and so on.

Now we find the similarity of U154 with the other members of timestamp t1. It is found that U154 of t4 is most similar to U15 of t1, where U15 = {1, 3, 5, 7}. U15 is not a subset of U30 (the highest-strength member), so we compute ns2 ∩ U15, giving ns3 = {1, 3, 5}. This behavior diffused into timestamp t4 from t1, with t2 and t3 as further layers.
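The worked example can be checked mechanically. The sets below are taken from the text, and the set-based cosine is one natural reading of the similarity used here:

```python
def cosine(a, b):
    """Cosine similarity for binary behavior sets."""
    return len(a & b) / (len(a) ** 0.5 * len(b) ** 0.5)

U30 = {1, 2, 3, 5, 6}      # highest strength at t1
U10 = {1, 3, 5, 6}         # member of t1
U1 = {1, 2, 3, 4, 5}       # member of t1
U15 = {1, 3, 5, 7}         # member of t1
U76 = {1, 3, 4, 5, 6, 7}   # highest strength at t2

# Under this similarity, U76 is indeed closer to U10 than to U30.
assert cosine(U76, U10) > cosine(U76, U30)

ns1 = U10 & U30   # behaviors diffused from t1 to t2 -> {1, 3, 5, 6}
ns2 = ns1 & U1    # after timestamp t3               -> {1, 3, 5}
ns3 = ns2 & U15   # after timestamp t4               -> {1, 3, 5}
print(ns1, ns2, ns3)
```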

So for timestamp t4 our prediction of collective behaviors worked well.

We now have a graph (figure 6.3) that describes the occurrence rate of each behavior at the end of this survey.

[Figure not included in this excerpt]

Figure 6.3.Occurrence rate of each term

From this figure we can say that most persons selected angry (1); worried (5) and annoyed (2) are roughly equal; cheated (3) and helpless (6) are roughly equal; and frustrated (4) and revengeful (7) come last, respectively. But at the end of timestamp t3 we found that annoyed and helpless had already been removed, whereas cheated was counted as a collective behavior even though its selection probability is less than that of annoyed and about the same as that of helpless.

Now consider figure 6.4, which we use to explain why annoyed is removed at the end of timestamp t2.

[Figure not included in this excerpt]

Figure 6.4.Term growth in different timestamps

The selection probability of annoyed dropped sharply at timestamp t2, by 55.5%, whereas the selection probability of cheated dropped by only 26.6% (figure 6.5).

[Figure not included in this excerpt]

Figure 6.5.Percentage decrement of Annoyed & Worried at t2 over t1

At timestamp t2, the selection probability of annoyed decreased more than that of cheated. Angry and annoyed are the most frequent selections in timestamp t2. If we calculate the probability of selecting {1, 2, 5} and {1, 3, 5}, we can see that {1, 3, 5} has the higher chance of being selected, as shown in figure 6.6.

[Figure not included in this excerpt]

Figure 6.6.Chances to be selected {1, 3, 5} over {1, 2, 5}

That means that most of the time, when someone selects 1 and 5, they select 3 along with them more often than 2. That is why, at timestamp t2, annoyed is removed while cheated remains.
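The data behind figure 6.6 is not included in the excerpt, so the following sketch uses invented user selections only to show how such a comparison can be computed: among users who picked both 1 and 5, count how often 3 versus 2 appears alongside them.

```python
# Hypothetical selections (each set is one user's chosen terms);
# the real survey responses are not part of this excerpt.
users = [
    {1, 3, 5}, {1, 3, 5, 6}, {1, 2, 5}, {1, 3, 4, 5},
    {1, 5, 7}, {1, 2, 3, 5}, {1, 3, 5}, {2, 5},
]

base = [u for u in users if {1, 5} <= u]   # users who picked both 1 and 5
p3 = sum(3 in u for u in base) / len(base)  # how often 3 joins {1, 5}
p2 = sum(2 in u for u in base) / len(base)  # how often 2 joins {1, 5}
print(p3, p2)
```

In this sample, 3 accompanies {1, 5} far more often than 2 does, mirroring the argument for dropping annoyed (2) while keeping cheated (3).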

6.5. Summary

We have used the concepts of herd behavior and collective behavior. Facebook was selected for an online survey to study users' behavior. We ran the survey from 7th April, 2015 to 23rd April, 2015 and collected the opinions of 160 persons. A statistical model is proposed to determine the behavior adoption of victims in different timestamps. The model was generated after successfully analyzing the collected dataset.

Chapter 7

Conclusion & Future Scope

Criminology’s neglect of social network analysis serves as a warning that the discipline is failing to keep up with important developments in scientific inquiry, and that criminology is missing an opportunity to test and expand upon some of its most treasured theories and concepts. Historically, solving crimes has been the duty of criminal justice and law enforcement specialists. With the increasing use of computerized systems to track crimes, computer data analysts have started helping law enforcement officers and detectives speed up the process of solving crimes. The process also helps society by presenting crime statistics from various perspectives. In today’s world, one of the best ways to make people aware of anything occurring in society is through social networks, and throughout this project we have tried to use this property in the field of criminology. Here the process has been applied to some local or imaginary data. Classification techniques can be applied to the process to generate faster and better-defined results. The process should be implemented on real data and then modified as required.

In the near future this technique can be extended to find crime patterns in records collected from different cities. The number of fields can be increased, for example with the time of the crime, descriptions of the criminals, the style of the crime, etc. Through data analytics, criminals may be detected and their locations found. We hope this concept will open a new area in solving crimes and help reduce criminal activity in our society.

References

[1] Freeman, L. C. (1996). Some antecedents of social network analysis. Connections, 19(1), 39-42.

[2] Granovetter, M. S. (1977). The strength of weak ties. In Social networks (pp. 347-367). Academic Press.

[3] Dubner, S. J. (2008). Is MySpace good for society? A Freakonomics quorum. New York Times.

[4] Debnath, S., Das, D., & Das, B. (2017, December). Identifying terrorist index (T+) for ranking homogeneous Twitter users and groups by employing citation parameters and vulnerability lexicon. In International Conference on Mining Intelligence and Knowledge Exploration (pp. 391-401). Springer, Cham.

[5] Easley, D., & Kleinberg, J. (2010). Networks, crowds, and markets (Vol. 8). Cambridge: Cambridge university press.

[6] Dunham, M. H., & Sridhar, S. (2006). Data Mining: Introductory and Advanced topics, Dorling Kindersley (India) Pvt.

[7] Yamuna, S., & Bhuvaneswari, N. S. (2012). Datamining techniques to analyze and predict crimes. The International Journal of Engineering And Science (IJES), 1(2), 243-247.

[8] Cohen-Almagor, R. (2013). Internet history. In Moral, ethical, and social dilemmas in the age of technology: Theories and practice (pp. 19-39). IGI Global.

[9] Leiner, B. M., Cerf, V. G., Clark, D. D., Kahn, R. E., Kleinrock, L., Lynch, D. C., ... & Wolff, S. (2009). A brief history of the Internet. ACM SIGCOMM Computer Communication Review, 39(5), 22-31.

[10] Al Mutawa, N., Baggili, I., & Marrington, A. (2012). Forensic analysis of social networking applications on mobile devices. Digital Investigation, 9, S24-S33.

[11] Papachristos, A. V. (2011). The coming of a networked criminology. Advances in criminological theory, 17, 101-140.

[12] Dasu, T., & Johnson, T. (2003). Exploratory data mining and data cleaning (Vol. 479). John Wiley & Sons.

[13] Witten, I. H. (2013). Data mining with weka. Department of Computer Science University of Waikato New Zealand.

[14] Tang, L., & Liu, H. (2010). Community detection and mining in social media. Synthesis lectures on data mining and knowledge discovery, 2(1), 1-137.

[15] Zafarani, R., Abbasi, M. A., & Liu, H. (2014). Social media mining: an introduction. Cambridge University Press.

[16] Polcicova, G., & Návrat, P. (2000). Combining content-based and collaborative filtering.

[17] Skillicorn, D. (2004). Social network analysis via matrix decompositions: al Qaeda. School of Computing Queen's University.

[18] Xu, J., & Chen, H. (2005). Criminal network analysis and visualization. Communications of the ACM, 48(6), 100-107.

[19] Abual-Rub, M. S., & Abdullah, R. (2007). A modified vector space model for protein retrieval.

[20] Pyle, D. (1999). Data preparation for data mining. Morgan Kaufmann.

[21] Raman, V., & Hellerstein, J. (2001). Potter's Wheel: an interactive framework for data cleaning and transformation. Working draft.

[22] Wang, R. Y., Storey, V. C., & Firth, C. P. (1995). A framework for analysis of data quality research. IEEE transactions on knowledge and data engineering, 7(4), 623-640.

[23] Katz, C. M. (2013, August). Understanding the role of social network analysis in intelligence led policing. In 2013 Phoenix Police Department symposia for command staff.

[24] Lauchs, M., Keast, R. L., & Le, V. (2012). Social network analysis of terrorist networks: can it add value?. Pakistan Journal of Criminology, 3(3), 21.

[25] Haggerty, J., Lamb, D. J., & Taylor, M. J. (2009). Social Network Visualization for Forensic Investigation of E-mail. In WDFIA (pp. 81-92).

i Author’s own work: all diagrams and images in this paper are fully original and were created by the author himself.

[...]


1 http://socialnetworking.lovetoknow.com/Characteristics_of_Social_Networks

2 https://gephi.org/

3 https://amitdegada.weebly.com/download.html - Information Theory, Amit Degada

4 http://www.kolkatapolice.gov.in/ShowAllMostWanted.aspx

5 www.facebook.com

Details

Pages
51
Year
2015
ISBN (Book)
9783346118257
Language
English
Catalog Number
v517365
Institution / College
Indian Institute of Engineering Science and Technology, Shibpur – IBM INDIA PRIVATE LIMITED
Grade
7.8