The term “Big Data” broadly covers all data relating to institutions and individuals. Data extraction is one way of accessing the information it contains.
Data extraction techniques reveal the information behind the data and show how to interpret it. In other words, data extraction is the process of deriving usable information from a large set of raw data by analysing patterns in that data using one or more programmes.
Data extraction has applications in many areas, such as science and research. Governments and large businesses, however, use it to learn more about their employees or customers, to develop more effective strategies, to make their programmes more successful and to utilise resources more effectively, by collecting, storing and processing data electronically.
Big data is a term used for any data set that is extremely large and that is difficult to handle and understand with traditional data management tools such as Microsoft Excel.
Data extraction is essentially a search for “a needle in a haystack” across large data sets. It matters to decision-makers because it helps them screen large amounts of data and make decisions that are consistent with established trends.
Systems in developed countries, whether at the level of governments or large commercial enterprises, utilise a range of innovative methods for the monitoring and supervision of operations. In public procurement, for example, the data generated when governments issue tenders is scrutinised and monitored to identify suspicious activities, collusion patterns, and false or fabricated information. Data extraction, supported by data visualisation, is used to identify “corrupt intent” in payments or transactions.
This type of work is carried out by researchers in a “specialist anti-corruption research centre” that examines large public procurement data sets for abnormal patterns, such as exceptionally short tender periods, or for unusual results, for example a winning bid that faced no competition or contracts repeatedly won by the same company.
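The red flags described above can be sketched in a few lines of Python. This is only an illustration, not the method of any particular research centre: the `Tender` record, its field names, the sample data and the two thresholds (`MIN_TENDER_DAYS`, `MIN_BIDDERS`) are all assumptions made for the example, and real thresholds would depend on national procurement rules.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date


@dataclass
class Tender:
    """Hypothetical procurement record; the field names are assumptions."""
    tender_id: str
    published: date
    deadline: date
    bidders: int
    winner: str


# Assumed thresholds for illustration only.
MIN_TENDER_DAYS = 14
MIN_BIDDERS = 2


def red_flags(tender: Tender) -> list[str]:
    """Return the list of abnormal patterns found in one tender."""
    flags = []
    # An exceptionally short window between publication and deadline.
    if (tender.deadline - tender.published).days < MIN_TENDER_DAYS:
        flags.append("short tender period")
    # A winning bid that faced no competition.
    if tender.bidders < MIN_BIDDERS:
        flags.append("no competition")
    return flags


# Made-up sample data.
tenders = [
    Tender("T-001", date(2020, 1, 1), date(2020, 2, 1), 5, "Alpha Ltd"),
    Tender("T-002", date(2020, 3, 1), date(2020, 3, 4), 1, "Beta Ltd"),
    Tender("T-003", date(2020, 4, 1), date(2020, 5, 1), 1, "Beta Ltd"),
]

# Keep only the tenders that raise at least one flag.
flagged = {t.tender_id: red_flags(t) for t in tenders if red_flags(t)}
print(flagged)
```

With this sample data, T-002 is flagged for both a short tender period and a lack of competition, and T-003 for a lack of competition only. A flag is a prompt for human review, not proof of wrongdoing.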
Anti-corruption software tools are available to detect fraud. These include “unusual” software that runs on advanced computers and assimilates large data sets and administrative procedures as part of the “intelligent extraction of information.”
How can we use data extraction to combat corruption?
An explosion in data streams has sparked a “data revolution,” and the use of data extraction techniques to identify customer preferences and predict buying patterns has become common practice for large businesses in the private sector. But can data extraction be used to combat corruption? If so, how does it work?
In Georgia, in 2014, Transparency International launched an open source data monitoring and analysis portal that extracted data from the e-procurement sites of government institutions and republished it in easy-to-use formats.
The portal enabled users (government employees) to create profiles of procurement transactions by government agencies and of companies bidding for public contracts, and to compile aggregated statistics on government expenditure.
To increase its capability to examine data and expose shortcomings in operations involving elected authorities and public finances, the European Commission, in cooperation with Transparency International, has developed a special data analysis programme. It examines data from various public and private institutions to help identify projects that are at risk of fraud or irregularities.
Data extraction programmes can be used to detect tax fraud and improve compliance by taxpayers. Similarly, data extraction can help combat money laundering: accountancy software analyses banking data and compares it with criminal data points, which may help to detect illicit cash flows, a high-priority issue on the Transparency International agenda.
A wealth of data can be gathered today through remote sensing, citizens’ reports from collective or community sources, news media, census data, mobile phone activity, social networking sites and other sources, and it offers many opportunities for data extraction. National development cannot be monitored without a data analysis centre, and policymakers cannot make the right decisions on the policies that public institutions subsequently rely on unless data extraction is used in planning.
To achieve this, big data can be analysed and information extracted to detect and deter corruption. This requires data experts to collaborate with anti-corruption institutions to develop smart software applications and then deploy powerful anti-corruption analyses.
For example, in India an online platform called “I Paid a Bribe” allows citizens to report cases of bribery and fraud encountered when dealing with government employees and officials. The platform has helped to counter bureaucratic corruption, which harms the majority of citizens in the course of their dealings with public institutions.
In Brazil, the government used a special monitoring and analysis programme to track public spending and detect fraud in its largest social welfare programme. By comparing the list of beneficiaries with the Federal Automobile Register, it identified thousands of ineligible beneficiaries.
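The kind of cross-check described for Brazil, matching one register against another, might look roughly like the sketch below. The records, the field names (`person_id`, `owner_id`, `vehicle_value`) and the eligibility rule (`MAX_VEHICLE_VALUE`) are invented for illustration. The actual criteria used by the Brazilian programme are not stated here.

```python
# Made-up sample data: a welfare beneficiary list and a vehicle register.
beneficiaries = [
    {"person_id": "001", "name": "Ana"},
    {"person_id": "002", "name": "Bruno"},
    {"person_id": "003", "name": "Carla"},
]
vehicle_register = [
    {"owner_id": "002", "vehicle_value": 95000},
    {"owner_id": "004", "vehicle_value": 12000},
    {"owner_id": "003", "vehicle_value": 4000},
]

# Assumed rule for illustration: owning a vehicle worth more than this
# amount makes a beneficiary ineligible.
MAX_VEHICLE_VALUE = 20000


def find_ineligible(beneficiaries, vehicle_register, max_value):
    """Return the IDs of beneficiaries who own a vehicle above max_value."""
    # Index the register by owner, keeping the most valuable vehicle.
    highest_value = {}
    for vehicle in vehicle_register:
        owner = vehicle["owner_id"]
        highest_value[owner] = max(
            highest_value.get(owner, 0), vehicle["vehicle_value"]
        )
    # Look each beneficiary up in the index.
    return [
        b["person_id"]
        for b in beneficiaries
        if highest_value.get(b["person_id"], 0) > max_value
    ]


ineligible = find_ineligible(beneficiaries, vehicle_register, MAX_VEHICLE_VALUE)
print(ineligible)  # ['002']
```

Indexing the register first means each beneficiary is checked with a single dictionary lookup, which matters once the lists contain millions of records.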
The World Economic Forum estimates that corruption costs more than 5% of global GDP, with more than $1 trillion paid in bribes annually. Data extraction is one of the most effective tools for detecting transactions associated with this sort of illegal behaviour. Today, most fraud and corruption investigations pull raw data from Enterprise Resource Planning (ERP) systems to find anomalies.
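One simple anomaly that investigators look for in payment data is the same invoice being paid twice. The sketch below assumes payment rows have already been exported from an ERP system into plain dictionaries. The field names and sample rows are hypothetical and do not reflect the schema of any particular ERP product.

```python
from collections import defaultdict

# Made-up payment rows; amounts are held in cents to avoid rounding issues.
payments = [
    {"payment_id": "P1", "vendor": "V-10", "invoice": "INV-7", "amount_cents": 120000},
    {"payment_id": "P2", "vendor": "V-11", "invoice": "INV-3", "amount_cents": 56000},
    {"payment_id": "P3", "vendor": "V-10", "invoice": "INV-7", "amount_cents": 120000},
    {"payment_id": "P4", "vendor": "V-12", "invoice": "INV-7", "amount_cents": 120000},
]


def duplicate_payments(rows):
    """Group payments by (vendor, invoice, amount) and return repeated groups."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["vendor"], row["invoice"], row["amount_cents"])
        groups[key].append(row["payment_id"])
    # A group with more than one payment ID is a possible duplicate payment.
    return [ids for ids in groups.values() if len(ids) > 1]


duplicates = duplicate_payments(payments)
print(duplicates)  # [['P1', 'P3']]
```

Payment P4 shares an invoice number and amount with P1 and P3 but goes to a different vendor, so it is not grouped with them. Whether that should also count as suspicious is a policy choice for the investigators.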
In the final analysis, Iraq desperately needs a data collection and analysis centre to identify vulnerabilities in public and private institutions. Such a centre requires specialists in data extraction and analysis. Transparency International could assist with its establishment, or the government could partner with a major international company specialising in this field, to achieve the desired goal in the fight against corruption.