Wednesday, March 23, 2016

Data processing

Early electronic computers such as Colossus made use of punched tape, a long strip of paper on which data was represented by a series of holes, a technology now obsolete. Electronic data storage, which is used in modern computers, dates from World War II, when a form of delay line memory was developed to remove the clutter from radar signals, the first practical application of which was the mercury delay line. The first random-access digital storage device was the Williams tube, based on a standard cathode ray tube, but the information stored in it and delay line memory was volatile in that it had to be continuously refreshed, and thus was lost once power was removed. The earliest form of non-volatile computer storage was the magnetic drum, invented in 1932 and used in the Ferranti Mark 1, the world's first commercially available general-purpose electronic computer.
IBM introduced the first hard disk drive in 1956, as a component of their 305 RAMAC computer system. Most digital data today is still stored magnetically on hard disks, or optically on media such as CD-ROMs. Until 2002 most information was stored on analog devices, but that year digital storage capacity exceeded analog for the first time. As of 2007 almost 94% of the data stored worldwide was held digitally: 52% on hard disks, 28% on optical devices and 11% on digital magnetic tape. It has been estimated that the worldwide capacity to store information on electronic devices grew from less than 3 exabytes in 1986 to 295 exabytes in 2007, doubling roughly every 3 years.
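The "doubling roughly every 3 years" figure can be checked against the two endpoints quoted above. The sketch below assumes steady exponential growth and uses 3 exabytes as an upper bound for "less than 3 exabytes"; the function name `doubling_time_years` is made up for this illustration.

```python
import math


def doubling_time_years(start, end, elapsed_years):
    """Doubling time implied by steady exponential growth from start to end."""
    return elapsed_years * math.log(2) / math.log(end / start)


# Figures quoted in the text. 3.0 EB is an assumed upper bound, since the
# text only says "less than 3 exabytes" for 1986.
START_EB = 3.0
END_EB = 295.0
YEARS = 2007 - 1986

estimate = doubling_time_years(START_EB, END_EB, YEARS)
print(f"Implied doubling time: {estimate:.1f} years")
```

Under these assumptions the result comes out a little above 3 years, which is consistent with the rounded figure in the text. A smaller 1986 starting value would shorten the estimate slightly.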

Databases

Database management systems emerged in the 1960s to address the problem of storing and retrieving large amounts of data accurately and quickly. One of the earliest such systems was IBM's Information Management System (IMS), which is still widely deployed more than 40 years later. IMS stores data hierarchically, but in the 1970s Ted Codd proposed an alternative relational storage model based on set theory, predicate logic, and the familiar concepts of tables, rows and columns. The first commercially available relational database management system (RDBMS) was released by Oracle in 1980.
All database management systems consist of a number of components that together allow the data they store to be accessed simultaneously by many users while maintaining its integrity. A characteristic of all databases is that the structure of the data they contain is defined and stored separately from the data itself, in a database schema.
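The separation between schema and data can be seen in a small sketch using Python's built-in `sqlite3` module. SQLite is chosen here only because it ships with Python; the table name `storage_device` and its columns are hypothetical, and the two rows reuse dates mentioned earlier in this post.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")

# The schema (structure) is declared first, independently of any data.
conn.execute(
    "CREATE TABLE storage_device (name TEXT PRIMARY KEY, year INTEGER NOT NULL)"
)

# The data is added afterwards and must fit the declared structure.
conn.executemany(
    "INSERT INTO storage_device VALUES (?, ?)",
    [("magnetic drum", 1932), ("hard disk drive", 1956)],
)

# SQLite stores the schema in its own catalog table, sqlite_master.
schema_sql = conn.execute(
    "SELECT sql FROM sqlite_master WHERE type = 'table' AND name = 'storage_device'"
).fetchone()[0]

rows = conn.execute("SELECT name, year FROM storage_device ORDER BY year").fetchall()

print(schema_sql)
print(rows)
```

The schema comes back from the catalog as text, while the rows come back from the table itself, so the two can be inspected and changed separately.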
The Extensible Markup Language (XML) has become a popular format for data representation in recent years. Although XML data can be stored in normal file systems, it is commonly held in relational databases to take advantage of their "robust implementation verified by years of both theoretical and practical effort". As an evolution of the Standard Generalized Markup Language (SGML), XML's text-based structure offers the advantage of being both machine- and human-readable.
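As a small illustration of a format that is both machine- and human-readable, the sketch below parses a short XML document with Python's standard `xml.etree.ElementTree` module. The element and attribute names (`devices`, `device`, `year`) are invented for this example.

```python
import xml.etree.ElementTree as ET

# A person can read this text directly; a program can parse it just as easily.
DOCUMENT = """<devices>
  <device year="1932">magnetic drum</device>
  <device year="1956">hard disk drive</device>
</devices>"""

root = ET.fromstring(DOCUMENT)

# Turn each <device> element into a (name, year) tuple.
devices = [(d.text, int(d.get("year"))) for d in root.findall("device")]
print(devices)
```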

Data retrieval
The relational database model introduced a programming-language independent Structured Query Language (SQL), based on relational algebra.
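To make the link to relational algebra concrete, the sketch below runs a query that combines a selection (the `WHERE` clause) with a projection (the column list) and compares it with the same operation written as a Python set comprehension. The table `storage_share` and its columns are hypothetical; the percentages are the 2007 figures quoted earlier in this post.

```python
import sqlite3

# (medium, kind, percent of stored data in 2007, as quoted in the text)
ROWS = [
    ("hard disk", "magnetic", 52),
    ("optical device", "optical", 28),
    ("digital magnetic tape", "magnetic", 11),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE storage_share (medium TEXT, kind TEXT, percent INTEGER)")
conn.executemany("INSERT INTO storage_share VALUES (?, ?, ?)", ROWS)

# Selection: keep only rows where kind = 'magnetic'.
# Projection: keep only the medium column.
sql_result = {
    row[0]
    for row in conn.execute("SELECT medium FROM storage_share WHERE kind = 'magnetic'")
}

# The same selection and projection expressed directly on the set of tuples.
algebra_result = {medium for (medium, kind, _percent) in ROWS if kind == "magnetic"}

print(sql_result == algebra_result)
```

The query says what is wanted rather than how to loop over the rows, which is what lets the same statement be issued from any host programming language.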
The terms "data" and "information" are not synonymous. Anything stored is data, but it only becomes information when it is organized and presented meaningfully. Most of the world's digital data is unstructured, and stored in a variety of different physical formats even within a single organization. Data warehouses began to be developed in the 1980s to integrate these disparate stores. They typically contain data extracted from various sources, including external sources such as the Internet, organized in such a way as to facilitate decision support systems (DSS).

Data transmission
Data transmission has three aspects: transmission, propagation, and reception. It can be broadly categorized as broadcasting, in which information is transmitted unidirectionally downstream, or telecommunications, with bidirectional upstream and downstream channels.

XML has been increasingly employed as a means of data interchange since the early 2000s, particularly for machine-oriented interactions such as those in web-oriented protocols like SOAP, describing "data-in-transit rather than ... data-at-rest". One of the challenges of such usage is converting data from relational databases into XML Document Object Model (DOM) structures.
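A minimal sketch of that conversion, using only Python's standard library (`sqlite3` and `xml.dom.minidom`), is shown below. It assumes a flat mapping in which each row becomes one element and each column becomes a child element; real conversions also have to deal with relationships between tables, NULL values and data types. The table and element names are hypothetical.

```python
import sqlite3
from xml.dom.minidom import Document

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE storage_share (medium TEXT, percent INTEGER)")
conn.executemany(
    "INSERT INTO storage_share VALUES (?, ?)",
    [("hard disk", 52), ("optical device", 28), ("digital magnetic tape", 11)],
)

doc = Document()
root = doc.createElement("storage_shares")
doc.appendChild(root)

cursor = conn.execute(
    "SELECT medium, percent FROM storage_share ORDER BY percent DESC"
)
# Column names come from the cursor metadata rather than being hard-coded.
columns = [description[0] for description in cursor.description]

for row in cursor:
    element = doc.createElement("share")  # one element per row
    for column, value in zip(columns, row):
        child = doc.createElement(column)  # one child element per column
        child.appendChild(doc.createTextNode(str(value)))
        element.appendChild(child)
    root.appendChild(element)

xml_text = doc.toxml()
print(xml_text)
```

Every value is converted to text on the way into the DOM, so type information such as "percent is an integer" is lost unless it is carried separately, for example in an XML schema.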

Data manipulation
Hilbert and Lopez identify the exponential pace of technological change (a kind of Moore's law): machines' application-specific capacity to compute information per capita roughly doubled every 14 months between 1986 and 2007; the per capita capacity of the world's general-purpose computers doubled every 18 months during the same two decades; the global telecommunication capacity per capita doubled every 34 months; the world's storage capacity per capita required roughly 40 months to double (a little over 3 years); and per capita broadcast information doubled every 12.3 years.
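Doubling periods are easier to compare once they are converted to annual growth rates. The sketch below assumes steady exponential growth, so that a quantity doubling every m months grows by a factor of 2^(12/m) per year; the dictionary keys are shorthand labels invented for this example.

```python
# Doubling periods in months, as quoted in the text above.
DOUBLING_MONTHS = {
    "application-specific computing": 14,
    "general-purpose computing": 18,
    "telecommunication": 34,
    "storage": 40,
    "broadcast": 12.3 * 12,  # 12.3 years expressed in months
}


def annual_growth(doubling_months):
    """Fractional growth per year for a quantity that doubles every doubling_months."""
    return 2 ** (12 / doubling_months) - 1


growth = {name: annual_growth(months) for name, months in DOUBLING_MONTHS.items()}

for name, rate in growth.items():
    print(f"{name}: {rate:.0%} per year")
```

Under that assumption the rates range from roughly 81% per year for application-specific computing down to about 6% per year for broadcasting, with storage at about 23%.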

Massive amounts of data are stored worldwide every day, but unless it can be analysed and presented effectively it essentially resides in what have been called data tombs: "data archives that are seldom visited". To address that issue, the field of data mining – "the process of discovering interesting patterns and knowledge from large amounts of data" – emerged in the late 1980s.
