SiSTec SIERRA LEONE

ICT IS THE DRIVING TOOL FOR DEVELOPMENT IN ANY COUNTRY

             SisTec COURSES    

INTRODUCTION TO INTERNET AND NETWORKING, COMPUTERS, OPERATING SYSTEMS, ENVIRONMENTAL PROTECTION, COMPUTER HARDWARE AND BASIC ELECTRONICS PRACTICALS: ONLINE TUTORIAL FOR HIGH SCHOOL AND UNIVERSITY LEVEL

 

 

INTRODUCTION TO NETWORKING AND THE INTERNET

tree graphic with many Internet symbols/icons hanging from the tree's branches: a camera, the twitter logo, the facebook logo, an email icon, etc.

The Internet is probably the most exciting, most popular, most visible and definitely the “coolest” information systems development of the decade.

What is the Internet?

The origins of the Internet can be found in the early sixties, when the U.S. Department of Defense sponsored a project to develop a telecommunications network that would survive a nuclear attack. It had to link together a diverse set of computers and work in a decentralized manner so that, if any part of the network were not functioning, network traffic would automatically be re-routed via other network nodes. This project quickly grew into a popular academic network linking virtually all major research institutions and U.S. universities. Soon other countries jumped onto the bandwagon, thus linking academics and researchers across the globe. True to the academic ethos, it quickly became a means for global information sharing. Before long, businesses also got a piece of the action. This was spurred on by the trend to network personal computers in home and business environments and by the development of more user-friendly, graphical interfaces: the web browser and the Windows operating system.

The Internet (or, more colloquially, the Net) consists of a huge and fast-growing number (hundreds of thousands) of interconnected networks linked together. Currently more than 100 million users are connected to the Internet. The popularity of the Internet can be explained by the amount of information it makes available: the equivalent of many libraries of information is stored on millions of computers (Internet hosts), much of it free of charge to all Internet users. This information is provided by educational institutions, governmental agencies and organizations, individuals, and increasingly by businesses. Hence, the Internet is frequently referred to as the Information Highway or the Infobahn.

But the Internet is more than just a huge information resource. Its initial purpose was to act as a communications network and it fulfills that role well. It is the transport mechanism for electronic mail, the transfer of computer files, remote computer access and even allows for voice calls. Businesses quickly realized the potential of the multimedia-enabled Internet for marketing purposes. Of late, more and more business transactions are being conducted via the Internet: electronic commerce (e-commerce) is the latest revolution to be embraced by the Internet community.

Electronic mail

A long line of brightly colored envelopes, each with a card inside with the "at" symbol on it.

Probably the most popular Internet service is electronic mail, more commonly known as email. This consists of the sending of messages composed on the computer, via a network, directly to the computer of the recipient who reads the message on his/her computer. Knowledge workers with access to e-mail write five to ten times as many e-mail messages as hand-written notes. The following are just some of the advantages of e-mail.

  • Reliability: although there is no guarantee, you will normally receive quick feedback if the address does not exist or there is a similar delivery problem.
  • Efficiency: many short-cut tools exist to increase your efficiency when composing messages. You can use your computer’s cut-and-paste function, you can keep managed address books and lists, and when replying to another message you can automatically incorporate any part of the message to which you are replying. It is also just as easy to send a message to a whole list of addressees as to a single one. (Admittedly, this results in a lot of abuse and information overload on the recipient’s side.)
  • Digital: e-mail is composed on a computer and remains in computer-readable format all the way to its destination. Thus one can also easily incorporate other computer data such as graphics or document files.
  • Cheap: because the capacity of the Internet and of disk storage is increasing all the time, the cost of sending and storing a one-page e-mail message is negligible.
  • Speed: messages are generally delivered across the world in a matter of seconds.

The e-mail address

Just like with ordinary postal mail (now usually referred to as snail-mail), you need to know the recipient’s address before you can send your message. Internet e-mail addresses have a standard format: username@domain. The username is often the name that your addressee uses to connect to the network, e.g. “jvanbelle” or sometimes a long number. This username is allocated by the LAN administrator. The domain identifies the file server, which acts as the local post office for your recipient’s e-mail. The domain consists of several parts, separated by full stops or dots. The international standard for domain identification is as follows.

  • The country code is the international two-letter code for the country (e.g., au for Australia, za for South Africa, sa for Saudi Arabia, uk for the United Kingdom, etc.).
  • The two most common types of organizations are co for a commercial organization and ac for an academic institution. Less frequent are org for (not-for-profit) organizations, mil for military, net for networks and gov for government agencies.
  • Each country has a national Internet naming body that allows its organizations to choose their own name, as long as no one has claimed the same name before. Examples of South African domain names are anc.org.za, uct.ac.za, fnb.co.za.
  • Large organizations often refine the domain further by adding the name of their LAN servers, e.g. mail.uct.ac.za.

Examples of possible e-mail addresses are: JaneDoe@stats.uct.ac.za (Jane working in the statistics department at the University of Cape Town in South Africa); info@anc.org.za (information department at the ANC, a political party) or SoapJoe@marketing.bt.co.uk (Joe Soap in the marketing department of British Telecom in the U.K.).
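To make the structure concrete, the short Python sketch below splits one of the hypothetical addresses above into its username and domain parts and then reads the domain from right to left. This is only an illustration of the naming convention, not of how mail software actually works.

    # Split a (hypothetical) e-mail address into username and domain.
    address = "JaneDoe@stats.uct.ac.za"

    username, domain = address.split("@", 1)
    labels = domain.split(".")                 # ["stats", "uct", "ac", "za"]

    print("username:         ", username)      # JaneDoe
    print("country code:     ", labels[-1])    # za  (South Africa)
    print("organization type:", labels[-2])    # ac  (academic institution)
    print("organization:     ", labels[-3])    # uct (University of Cape Town)
    print("server/department:", labels[:-3])   # ['stats']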

The US Americans, having “invented” the Internet, use a slightly different way for their addresses. They leave off their country code (us) and use com for commercial organization or edu for educational institution. Since the majority of Internet users hail from the US, you will encounter many addresses such as JWood@mit.edu or Bill@microsoft.com.

Netiquette

Just as in any other social interaction environment, there are some rules and guidelines for appropriate social behavior on the Net: Netiquette (etiquette on the Internet). The following are some illustrative examples pertaining primarily to e-mail.

  • Shouting, THE PRACTICE OF TYPING ENTIRE SENTENCES IN UPPER CASE, is generally seen as novice (newbie) behavior and frowned upon. Perhaps it stems from the disgust with old teletypes and mainframe terminals that did not have lower-case characters.
  • The use of emoticons to indicate the emotive content of a sentence is highly recommended. Typed text does not reveal any body language and a joking remark can easily be interpreted the wrong way. Whenever one writes something in jest or with humorous intent, it is advisable to add an emoticon. An emoticon (an icon indicating emotional content) consists of a series of text characters which, rotated a quarter turn, represent a laughing :-) (the “smiley”), winking ;-) or sad :-( face.
  • Flaming is the carrying on of a heated personal emotional debate between two or more individuals on a public Internet forum. A flame war is generally a sign of immature behavior by individuals who cannot take perspective and should really take the discussion off-line.
  • Netizens (inhabitants of the Internet, i.e. frequent net surfers) often use standard but, to the non-initiated, cryptic abbreviations. Examples are: BTW = by the way; ROFL = rolling on the floor with laughter; TPTB = the powers that be; BRB = be right back. This vocabulary has been adopted and expanded with the growth of Short Message Service (SMS) use on cellular phones.

The Web

The Internet service that has received the most attention from the public media is the World Wide Web, or the Web for short (sometimes also called WWW or W3). The Web is a vast collection of multimedia information located on Web servers attached to the Internet.

Its popularity is due to a number of reasons.

  • Information links are transparent. Links to any other piece of information located anywhere on the Internet can be inserted in a web document. A simple click of the mouse takes the reader completely automatically from one Web server to another, quite possibly in another country.
  • Information can be presented in a hypertext link format whereby one can jump immediately from one concept to a related concept or explanation. No need to read text in the traditional top-to-bottom sequential way.
  • It allows for multimedia information. A Web document can incorporate rich and colourful graphics, animation, video clips, sound etc. Just think of the marketing opportunities!
  • The Web supports interactive applications. Web applications can request information from visiting users and documents can include programming instructions. Users can even download small programs (often written in Java) that could perform some processing on the user’s computer or display special visual effects.

Graphic showing a surfer. Part of a Web address is visible in the background.

Reading or accessing information on the Web is called surfing the Net because one jumps from one hypertext link to another following whatever takes your fancy. In order to surf the Net you need some special browser program that understands the Web protocols and formats and presents the information to suit your computer monitor. You also need an access point or connection to the Internet. Your Internet connection may be automatic if your computer is connected to a (corporate) LAN that connects directly to the Internet, or it may be by means of a special subscription to a business that specializes in providing Internet access for others: the Internet Service Provider (ISP). Access to the ISP for individual users is usually via a dial-up connection i.e. using a modem and telephone.

Once a newbie (new user) is connected to the Internet (online), she faces the daunting task of finding her way amongst the huge variety of information offered. The easiest way in is usually by means of a search engine: a Web site that tries to catalogue the information available on the Internet. By entering one or more search words, the engine will provide you with a couple of adverts and a list of documents that contain the word(s) for which you are looking.

All information on the Web is uniquely identified by its URL (Uniform Resource Locator), which is really the full Internet address of a Web document. The URL consists of the Web server’s domain address, followed by the access path and file name on the server. Examples of URLs are www.hotbot.com/sports/main.html (the main page on the sports section of the HotBot search engine) or http://www.commerce.uct.ac.za/informationsystems/ (containing details about UCT’s department of information systems). Note the similarities and differences between a URL and an e-mail address.
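The same decomposition can be done programmatically. The sketch below, assuming Python, uses the standard urllib.parse module to split one of the example URLs above into the protocol, the server’s domain address and the access path.

    from urllib.parse import urlparse

    # One of the example URLs from the text.
    url = "http://www.commerce.uct.ac.za/informationsystems/"
    parts = urlparse(url)

    print("protocol (scheme):", parts.scheme)   # http
    print("domain (server):  ", parts.netloc)   # www.commerce.uct.ac.za
    print("access path:      ", parts.path)     # /informationsystems/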

Other Internet services

A number of other services are available on the Internet. The Usenet consists of ongoing discussion fora (or newsgroups) on an extremely wide variety of topics, from forensic psychology to Douglas Adams, from Star Trek to cryptography. The discussion happens entirely by means of e-mail and, when you subscribe to a given newsgroup, you can browse through the contributions of the last few days and reply with your own contribution.

More specialized services exist, such as ftp (file transfer protocol) for the transfer of large computer files, and telnet, for remote access to computers elsewhere, but they are used less frequently. In any case, these services are now performed transparently by most Web browsers. Similarly, older services such as Gopher and Veronica have been replaced almost entirely by the Web.

Internet protocols and standards

Different computers and networks can communicate via the Internet because a number of basic Internet communication standards have been defined. Any network connected to the Internet will translate its own standards and protocols into those used on the Internet by means of a gateway.

The most fundamental and “lowest level” protocol is the TCP/IP (Transmission Control Protocol/Internet Protocol). This protocol is also the native protocol of computers using the Unix operating system, which explains why Unix computers are so popular as Internet servers.

On top of TCP/IP are the “mid-level” protocols defined for the various Internet services. Perhaps the best known of these is http (Hypertext Transfer Protocol), which specifies how Web information is made available and transmitted across the Internet. Other protocols and standards are SMTP and MIME (for e-mail) or ftp.
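The layering can be made concrete with a small sketch using Python’s standard http.client module: the library first opens a TCP/IP connection and then speaks http over it. The host name is only an illustrative example, and network access is obviously required for the request to succeed.

    import http.client

    # http runs on top of a TCP/IP connection, conventionally to port 80.
    conn = http.client.HTTPConnection("www.example.com", 80, timeout=10)
    conn.request("GET", "/")          # send an http request over the TCP connection
    response = conn.getresponse()     # the http response comes back the same way
    print(response.status, response.reason)
    conn.close()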

Photo of computer screen showing HTML

 

HTML

Information made available via the Web is usually formatted using a special standard: the Hypertext Markup Language (HTML), which actually consists of plain text files with visual formatting commands inserted between the text. Most desktop productivity software allows you to save your document directly in the HTML format. Special HTML editors allow much finer control over the final layout of your Web document. A later development is Extensible Markup Language (XML), which increases the flexibility of web documents by allowing them to be viewed not only using a web browser, but also on different platforms such as a PDA or cellular telephone.
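Because HTML is just plain text, a web document can be produced by any program that writes text. The sketch below, assuming Python, writes a minimal page to a file; opening hello.html in a browser shows a heading, a paragraph and a working hyperlink.

    # HTML is plain text with formatting commands (tags) inserted between the text.
    page = """<html>
      <head><title>Hello, Web</title></head>
      <body>
        <h1>Hello, Web</h1>
        <p>This page is plain text with tags.
           <a href="http://www.uct.ac.za/">This is a hyperlink.</a></p>
      </body>
    </html>"""

    with open("hello.html", "w") as f:
        f.write(page)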

 

Reading: E-Commerce

Introduction

Graphic of a green Internet shopping cart

Business was quick to grasp the marketing and business potential offered by the Internet. Initially, businesses used the Internet to facilitate communication by means of e-mail. This was quickly followed by tapping the web’s potential for the dissemination of product and other marketing information. The provision of advertising space (banners) on frequently visited web sites is the main source of income for search engines (sites allowing you to search the Internet for information) and web portals (web sites that provide additional value-added personal services such as news, financial information, weather forecasts, items of interest, etc.).

A number of specialized companies have realized that the Internet can be a direct and extremely cost-effective channel of distribution. Some companies already have a physical infrastructure and use the web to enhance their distribution channel, e.g. you can now order your pizza, bank statements or movie tickets via the web. Other, virtual companies have almost no physical infrastructure and are mere “conductors” for the flow of products or services.

Important categories of e-commerce include:

  • Business-to-consumer (B2C) in which organizations provide information online to customers, who can in turn place orders and make payments via the internet
  • Business-to-business (B2B) in which business partners collaborate electronically
  • Consumer-to-consumer (C2C) in which individuals sell products or services directly to other individuals.

The technologies that are needed to support electronic commerce include the network infrastructure (Internet, intranets, extranets), software tools for web site development and maintenance, secure ordering and payment methods, and resources for information sharing, communication and collaboration. When e-commerce is done in a wireless environment, such as through the use of cellphones, this is referred to as mobile commerce (m-commerce).

B2C e-Commerce

Electronic retailing is similar in principle to home shopping from catalogues, but offers a wider variety of products and services, often at lower prices. Search engines make it easy to locate and compare competitors’ products from one convenient location and without being restricted to usual shopping hours. Electronic malls provide access to a number of individual shops from one website. On-line auctions have also proved a popular way of disposing of items that need a quick sale.

Business-to-consumer commerce allows customers to make enquiries about products, place orders, pay accounts, and obtain service support via the Internet. Since customers can enter transactions at any time of the day or night, and from any geographical location, this can be a powerful tool for expanding the customer base of a business. However, the existence of a website does not guarantee that customers will use it, or that they will return to it after a first visit. Firms investing in electronic commerce need to consider a number of factors in developing and maintaining their e-commerce sites.

A successful web site should be attractive to look at and easy to use. In addition, it should offer its customers good performance, efficient service, personalization, incentives to purchase and security. Inadequate server power and communications capacity may cause customers to become frustrated when browsing or selecting products.

Many sites record details of their customers’ interests, so that they can be guided to the appropriate parts of the site. Customer loyalty can also be developed by offering discussion forums and links to related sites, and by providing incentives such as discounts and special offers for regular customers. And if you expect customers to purchase goods, and not just browse, then it is vital that customers should have complete confidence in the security of their personal information, and in the ability of the web store to deliver the goods as requested.

Much of the business value of the Internet lies in the ability to provide increased value to customers, with the focus on quality of service rather than simply price. By opening additional channels of communication between the business and its customers, businesses can find out the preferences of their customers, and tailor products to their needs. Customers can use the Internet to ask questions, air complaints, or request product support, which increases customer involvement in business functions such as product development and service.

However, although businesses may increase their markets while gaining from reduced advertising and administration costs, problems that have emerged include alienation of regular distributors, difficulty in shipping small orders over large distances, fierce competition and inadequate profit margins. Because of the delivery problem for physical products, many successful e-commerce firms have focused on the delivery of services, such as banking, securities trading, employment agencies and travel bureaus. Of course, every problem can be regarded as an opportunity: a local software developer has created and marketed a route-scheduling application, built on powerful geographical information systems with a user-friendly interface, which produces optimized route sheets, maps for individual routes and step-by-step driving instructions for effective and timeous order management.

B2C e-commerce has also made it easier for firms to conduct market research, not only by collecting shopping statistics, but also by using questionnaires to find out what specific groups of customers want. This in turn has enabled the personalization of products to meet customer preferences.

B2B e-Commerce

Photo of a handful of brightly colored paperclips against a black background.

Business-to-business e-commerce comprises the majority of electronic transactions, involving the supply chain between organizations and their distributors, resellers, suppliers and other partners. Efficient management of the supply chain can cut costs, increase profits, improve relationships with customers and suppliers, and gain competitive advantage. To achieve this, firms need to

  • Get the right product to the right place at the least cost;
  • Keep inventory as low as possible while meeting customer requirements;
  • Reduce cycle times by speeding up the acquisition and processing of raw materials.

Information technologies used to support business-to-business e-commerce include email, EDI and EFT, product catalogues, and order processing systems. These functions may be linked to traditional accounting and business information systems, to ensure that inventory and other databases are automatically updated via web transactions. Intranets provide a facility for members of an organization to chat, hold meetings and exchange information, while at the same time sensitive information is protected from unauthorized access by means of a firewall. An extranet provides a means of access to the intranet for authorized users such as business consultants.

Electronic data interchange (EDI) involves the electronic exchange of business transaction documents over computer networks, between organizations and their customers or suppliers. Value-added networks provided by third parties are frequently used for this purpose. Documents such as purchase orders, invoices and requests for quotations are electronically interchanged using standard message formats, which are specified by international protocols. EDI eliminates printing, postage and manual handling of documents, reducing time delays and errors, and thus increasing productivity. It also provides support for implementing a Just-in-Time approach, which reduces lead time, lowers inventory levels, and frees capital for the business.

Marketing to other businesses is done by means of electronic catalogues and auction sites, which can increase sales while reducing advertising and administrative costs. From the buyer’s perspective, reverse auctions can be used to advertise requests for quotation in a bidding marketplace in order to attract potential suppliers. Third party vendors can make use of group purchasing to aggregate a number of separate small orders in order to increase negotiating power.

Collaborative commerce involves long-term relationships between organizations in areas such as demand forecasting, inventory management, and product design and manufacture. However, this presents a number of business challenges such as software integration, compatibility of technologies, and building of trust between firms.

C2C e-Commerce

Auctions are the most popular method of conducting business between individuals over the Internet. (Unfortunately, auction fraud was also the most common type of crime reported to the Internet Fraud Complaint Centre in 2002.) Other C2C activities include classified advertising, selling of personal services such as astrology and medical advice, and the exchange of files especially music and computer games.

Electronic funds transfer

Electronic payment systems can be used to transfer funds between the bank accounts of a business and its suppliers, or from a customer to the business. Wide area networks may connect POS terminals in retail stores to bank EFT systems. In most cases, an intermediary organization acts as an automated clearinghouse, which debits and credits the relevant accounts.

The most popular payment method used by individual consumers is the credit card, which requires the merchant to pay a commission to the bank on each transaction. For transactions involving small amounts that do not justify the payment of commission, merchants may accept electronic money in the form of digital cash. In this case, the customer “buys” money from the bank in the form of a unique cash number, which is transmitted to the merchant at the time of purchase and “deposited” in an account at a participating bank. In South Africa several banks have developed their own forms of digital cash, such as e-bucks from First National Bank.

Screen shot of online payment options: Visa, Master Card, American Express, Discover, and PayPal.

An important issue in electronic commerce is the security of Internet transactions. Data is commonly encrypted to reduce the vulnerability of credit card transactions. Secure Sockets Layer (SSL) and Secure Electronic Transaction (SET) are two of the standards used to secure electronic payments on the Internet. Secure sites usually have URLs that begin with https instead of the usual http.
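The presence of https means that the browser sets up an encrypted SSL/TLS channel before any payment details are sent. The sketch below, assuming Python and its standard ssl and socket modules, connects to the https port (443) of an example host and prints the protocol version and cipher that were negotiated; the host name is only a placeholder.

    import socket
    import ssl

    host = "www.example.com"            # placeholder host, not a real merchant
    context = ssl.create_default_context()

    # Open a TCP connection and wrap it in an encrypted SSL/TLS channel.
    with socket.create_connection((host, 443), timeout=10) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=host) as secure_sock:
            print("protocol:", secure_sock.version())    # e.g. TLSv1.3
            print("cipher:  ", secure_sock.cipher()[0])  # negotiated cipher suite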

Current Issues in e-Commerce

For e-commerce to succeed, companies need to make large investments in hardware and telecommunications infrastructures that will be up and running 100% of the time, and software that is easy to use and reliable. A number of early participants in the e-commerce market suffered financial losses because their technology was not able to handle the huge numbers of transactions to be processed. Internet customers are often impatient, and will move to a competing site if the response is too slow.

Gaining the trust of customers can be difficult—the seller is often reluctant to despatch goods before payment, and the buyer may be reluctant to pay before receiving the goods. In South Africa, the speed of electronic ordering is often negated by delays in physical delivery.

Societal problems have also emerged, with children, gamblers and shopping addicts enjoying unrestricted access to electronic commerce sites. A German cannibal posted a web advertisement seeking a victim who was willing to be killed, sliced and eaten – and apparently found one! (reported in www.iol.co.za, 18 December 2002). The laws governing electronic commerce are still in their infancy, and international standards need to be developed in areas such as information privacy and taxation.

Since e-commerce supports global business transactions, it presents the challenge of customizing web sites to appeal to people of different nationalities and cultures (and even different languages). South Africa’s leading role in e-commerce in Africa can probably be attributed to the fact that it has a relatively advanced telecommunication infrastructure and a large number of English-speaking users.

South African Perspective

Because a website can easily be developed as a front for a fraudulent company, businesses need a way to guarantee their authenticity to potential customers. Thawte Consulting, the company established by Mark Shuttleworth after graduating from UCT and later sold to market leader Verisign, provided (among other products) digital certificates which serve two purposes: to ensure that no sensitive information can be viewed by unauthorized users, and to provide users with assurance regarding the ownership of the site. By providing certificates at a lower cost than its competitors, but with similar technological and security standards, Thawte rapidly established itself as the second largest provider of digital certificates.

These non-forgeable Secure Sockets Layer (SSL) certificates are issued and digitally signed by a company such as Thawte, which has verified that the website really is owned by the organization requesting the certificate. Once the digital certificate has been installed on the site, the SSL uses complex encryption techniques to scramble confidential information.

Beyond the Basics

Encryption is the process of converting readable data into unreadable characters to prevent unauthorised access. Encrypted data can be safely transmitted or stored, but must be decrypted before it can be read, using an encryption key, which is sometimes the same key that was used to scramble the data in the first place. Simple encryption methods include:

  • Transposition: in which the order of characters is switched, for example each pair of adjacent characters is swapped.
  • Substitution: in which each character is replaced by some other predetermined character.
  • Expansion: additional letters are inserted after each of the characters in the original text.
  • Compaction: characters are removed from specific positions and then stored or transmitted separately.

Most encryption programs use a combination of all four methods.
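The sketch below, assuming Python, illustrates the first two methods on a short message: a transposition that swaps each pair of adjacent characters, and a substitution that shifts every letter one place along the alphabet. Reversing the two steps with the same keys recovers the original text.

    # Transposition: swap each pair of adjacent characters.
    def transpose(text):
        chars = list(text)
        for i in range(0, len(chars) - 1, 2):
            chars[i], chars[i + 1] = chars[i + 1], chars[i]
        return "".join(chars)

    # Substitution: shift every letter by a fixed amount (a Caesar-style cipher).
    def substitute(text, shift=1):
        out = []
        for ch in text:
            if ch.isalpha():
                base = ord("A") if ch.isupper() else ord("a")
                out.append(chr((ord(ch) - base + shift) % 26 + base))
            else:
                out.append(ch)
        return "".join(out)

    message = "pay on delivery"
    scrambled = substitute(transpose(message))
    print(scrambled)                                   # unreadable without the key
    print(transpose(substitute(scrambled, shift=-1)))  # pay on delivery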

Private key encryption relies on both sender and recipient having access to the same encryption key. Public key encryption makes use of two keys: a message encrypted with your public key can only be decrypted using your private key. This means that you can safely communicate your public key to business contacts, who are then able to send you confidential data that can only be read using your private key. Security agencies in the United States have lobbied for some time for private keys to be independently stored, so that encrypted communications could be monitored when national security is considered to be at risk.
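The idea can be illustrated with a toy version of the RSA scheme, using deliberately tiny numbers; real keys are hundreds of digits long and real systems rely on vetted cryptographic libraries, never on code like this. Anyone may encrypt with the public pair (e, n), but only the holder of the private exponent d can decrypt.

    # Toy RSA with tiny numbers -- for illustration only, never for real security.
    p, q = 61, 53                 # two secret primes
    n = p * q                     # 3233: part of both the public and private key
    phi = (p - 1) * (q - 1)       # 3120
    e = 17                        # public exponent (may be shared freely)
    d = pow(e, -1, phi)           # private exponent (kept secret)

    message = 65                          # a message encoded as a number < n
    ciphertext = pow(message, e, n)       # anyone can encrypt with the PUBLIC key
    recovered = pow(ciphertext, d, n)     # only the PRIVATE key can decrypt
    print(ciphertext, recovered)          # 2790 65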

Reflection Questions

B2C e-commerce

  • What advantages does the customer stand to gain from B2C e-commerce, compared with traditional business models?
  • Can you think of any potential disadvantages?

C2C e-commerce

  • Give reasons why internet auctions are a common source of fraud, and suggest control structures that could be put in place to reduce this problem.

B2B e-commerce

  • Explain how B2B e-commerce could contribute to each of the alternative strategies for competitive advantage (low-cost, differentiation, niche marketing) that were described in the previous chapter.
 
 

Introduction

The Internet is a global system of interconnected computer networks that use the standard Internet protocol suite (TCP/IP) to link several billion devices worldwide. It is a network of networks that consists of millions of private, public, academic, business, and government networks of local to global scope, linked by a broad array of electronic, wireless, and optical networking technologies.

Sculpture of a man with an "@" symbol head and wings, riding on a bike with @ wheels.

 

The Internet Messenger by Buky Schwartz in Holon.

The Internet carries an extensive range of information resources and services, such as the inter-linked hypertext documents and applications of the World Wide Web (WWW), the infrastructure to support email, and peer-to-peer networks for file sharing and telephony.

The origins of the Internet date back to research commissioned by the United States government in the 1960s to build robust, fault-tolerant communication via computer networks. This work, combined with efforts in the United Kingdom and France, led to the primary precursor network, the ARPANET, in the United States. The interconnection of regional academic networks in the 1980s marks the beginning of the transition to the modern Internet. From the early 1990s, the network experienced sustained exponential growth as generations of institutional, personal, and mobile computers were connected to it.

The funding of a new U.S. backbone by the National Science Foundation in the 1980s, as well as private funding for other commercial backbones, led to worldwide participation in the development of new networking technologies, and the merger of many networks. Though the Internet has been widely used by academia since the 1980s, the commercialization of what was by the 1990s an international network resulted in its popularization and incorporation into virtually every aspect of modern human life. As of 2014, 38 percent of the world’s human population has used the services of the Internet within the past year–over 100 times more people than were using it in 1995. Internet use grew rapidly in the West from the mid-1990s to early 2000s and from the late 1990s to present in the developing world.

Most traditional communications media, including telephony and television, are being reshaped or redefined by the Internet, giving birth to new services such as voice over Internet Protocol (VoIP) and Internet Protocol television (IPTV). Newspaper, book, and other print publishing are adapting to website technology, or are reshaped into blogging and web feeds. The entertainment industry, including music, film, and gaming, was initially the fastest growing online segment. The Internet has enabled and accelerated new forms of human interactions through instant messaging, Internet forums, and social networking. Online shopping has grown exponentially both for major retailers and small artisans and traders. Business-to-business and financial services on the Internet affect supply chains across entire industries.

The Internet has no centralized governance in either technological implementation or policies for access and usage; each constituent network sets its own policies. Only the overreaching definitions of the two principal name spaces in the Internet, the Internet Protocol address space and the Domain Name System (DNS), are directed by a maintainer organization, the Internet Corporation for Assigned Names and Numbers (ICANN). The technical underpinning and standardization of the core protocols is an activity of the Internet Engineering Task Force (IETF), a non-profit organization of loosely affiliated international participants that anyone may associate with by contributing technical expertise.

Terminology

The Internet, referring to the specific global system of interconnected IP networks, is a proper noun and may be written with an initial capital letter. In the media and common use it is often not capitalized, viz. the internet. Some guides specify that the word should be capitalized when used as a noun, but not capitalized when used as an adjective. The Internet is also often referred to as the Net.

Historically the word internetted was used, uncapitalized, as early as 1849 as an adjective meaning “Interconnected; interwoven”. The designers of early computer networks used internet both as a noun and as a verb in shorthand form of internetwork or internetworking, meaning interconnecting computer networks.

The terms Internet and World Wide Web are often used interchangeably in everyday speech; it is common to speak of “going on the Internet” when invoking a web browser to view web pages. However, the World Wide Web, or the Web, is only one of a large number of Internet services. The Web is a collection of interconnected documents (web pages) and other web resources, linked by hyperlinks and URLs. As another point of comparison, Hypertext Transfer Protocol, or HTTP, is the language used on the Web for information transfer, yet it is just one of many languages or protocols that can be used for communication on the Internet.

The term Interweb is a portmanteau of Internet and World Wide Web typically used sarcastically to parody a technically unsavvy user.

History

Photo of text from the very first message ever sent via the arpanet.

 

Text from the very first message ever sent via the ARPANET.

Research into packet switching started in the early 1960s and packet switched networks such as Mark I at NPL in the UK, ARPANET, CYCLADES, Merit Network, Tymnet, and Telenet, were developed in the late 1960s and early 1970s using a variety of protocols. The ARPANET in particular led to the development of protocols for internetworking, in which multiple separate networks could be joined together into a network of networks.

The first two nodes of what would become the ARPANET were interconnected between Leonard Kleinrock’s Network Measurement Center at UCLA’s School of Engineering and Applied Science and Douglas Engelbart’s NLS system at SRI International (SRI) in Menlo Park, California, on 29 October 1969. The third site on the ARPANET was the Culler-Fried Interactive Mathematics center at the University of California at Santa Barbara, and the fourth was the University of Utah Graphics Department. In an early sign of future growth, there were already fifteen sites connected to the young ARPANET by the end of 1971. These early years were documented in the 1972 film Computer Networks: The Heralds of Resource Sharing.

Early international collaborations on the ARPANET were rare. European developers were concerned with developing the X.25 networks. Notable exceptions were the Norwegian Seismic Array (NORSAR) in June 1973, followed in 1973 by Sweden with satellite links to the Tanum Earth Station, and Peter T. Kirstein’s research group in the UK, initially at the Institute of Computer Science, University of London and later at University College London.

In December 1974, RFC 675 – Specification of Internet Transmission Control Program, by Vinton Cerf, Yogen Dalal, and Carl Sunshine, used the term internet as a shorthand for internetworking, and later RFCs repeat this use. Access to the ARPANET was expanded in 1981 when the National Science Foundation (NSF) developed the Computer Science Network (CSNET). In 1982, the Internet Protocol Suite (TCP/IP) was standardized and the concept of a world-wide network of fully interconnected TCP/IP networks called the Internet was introduced.

Map of NSFNET networks across the U.S. in 1992.

 

T3 NSFNET Backbone, c. 1992.

TCP/IP network access expanded again in 1986 when the National Science Foundation Network (NSFNET) provided access to supercomputer sites in the United States from research and education organizations, first at 56 kbit/s and later at 1.5 Mbit/s and 45 Mbit/s. Commercial Internet service providers (ISPs) began to emerge in the late 1980s and early 1990s. The ARPANET was decommissioned in 1990. The Internet was fully commercialized in the U.S. by 1995 when NSFNET was decommissioned, removing the last restrictions on the use of the Internet to carry commercial traffic. The Internet started a rapid expansion to Europe and Australia in the mid to late 1980s and to Asia in the late 1980s and early 1990s.

Since the mid-1990s the Internet has had a tremendous impact on culture and commerce, including the rise of near instant communication by email, instant messaging, Voice over Internet Protocol (VoIP) “phone calls”, two-way interactive video calls, and the World Wide Web with its discussion forums, blogs, social networking, and online shopping sites. Increasing amounts of data are transmitted at higher and higher speeds over fiber optic networks operating at 1-Gbit/s, 10-Gbit/s, or more.

Worldwide Internet users
                                  2005          2010          2014 (estimate)
World population                  6.5 billion   6.9 billion   7.2 billion
Not using the Internet            84%           70%           60%
Using the Internet                16%           30%           40%
Users in the developing world     8%            21%           32%
Users in the developed world      51%           67%           78%

Source: International Telecommunications Union. The 2014 figures are estimates.

The Internet continues to grow, driven by ever greater amounts of online information and knowledge, commerce, entertainment and social networking. During the late 1990s, it was estimated that traffic on the public Internet grew by 100 percent per year, while the mean annual growth in the number of Internet users was thought to be between 20% and 50%. This growth is often attributed to the lack of central administration, which allows organic growth of the network, as well as the non-proprietary open nature of the Internet protocols, which encourages vendor interoperability and prevents any one company from exerting too much control over the network. As of 31 March 2011, the estimated total number of Internet users was 2.095 billion (30.2% of world population). It is estimated that in 1993 the Internet carried only 1% of the information flowing through two-way telecommunication, by 2000 this figure had grown to 51%, and by 2007 more than 97% of all telecommunicated information was carried over the Internet.

Governance

 

ICANN headquarters in the Playa Vista neighborhood of Los Angeles, California, United States.

The Internet is a globally distributed network comprising many voluntarily interconnected autonomous networks. It operates without a central governing body.

The technical underpinning and standardization of the core protocols (IPv4 and IPv6) is an activity of the Internet Engineering Task Force (IETF), a non-profit organization of loosely affiliated international participants that anyone may associate with by contributing technical expertise.

To maintain interoperability, the principal name spaces of the Internet are administered by the Internet Corporation for Assigned Names and Numbers (ICANN), headquartered in the neighborhood of Playa Vista in the city of Los Angeles, California. ICANN is the authority that coordinates the assignment of unique identifiers for use on the Internet, including domain names, Internet Protocol (IP) addresses, application port numbers in the transport protocols, and many other parameters. Globally unified name spaces, in which names and numbers are uniquely assigned, are essential for maintaining the global reach of the Internet. ICANN is governed by an international board of directors drawn from across the Internet technical, business, academic, and other non-commercial communities. ICANN’s role in coordinating the assignment of unique identifiers distinguishes it as perhaps the only central coordinating body for the global Internet.

Regional Internet Registries (RIRs) allocate IP addresses:

  • African Network Information Center (AfriNIC) for Africa
  • American Registry for Internet Numbers (ARIN) for North America
  • Asia-Pacific Network Information Centre (APNIC) for Asia and the Pacific region
  • Latin American and Caribbean Internet Addresses Registry (LACNIC) for Latin America and the Caribbean region
  • Réseaux IP Européens – Network Coordination Centre (RIPE NCC) for Europe, the Middle East, and Central Asia

The National Telecommunications and Information Administration, an agency of the United States Department of Commerce, continues to have final approval over changes to the DNS root zone.

The Internet Society (ISOC) was founded in 1992 with a mission to “assure the open development, evolution and use of the Internet for the benefit of all people throughout the world”. Its members include individuals (anyone may join) as well as corporations, organizations, governments, and universities. Among other activities ISOC provides an administrative home for a number of less formally organized groups that are involved in developing and managing the Internet, including: the Internet Engineering Task Force (IETF), Internet Architecture Board (IAB), Internet Engineering Steering Group (IESG), Internet Research Task Force (IRTF), and Internet Research Steering Group (IRSG).

On 16 November 2005, the United Nations-sponsored World Summit on the Information Society, held in Tunis, established the Internet Governance Forum (IGF) to discuss Internet-related issues.

Infrastructure

The communications infrastructure of the Internet consists of its hardware components and a system of software layers that control various aspects of the architecture.

Routing and service tiers

Graphic showing tiers of internet service providers.

 

Packet routing across the Internet involves several tiers of Internet service providers.

Internet service providers establish the world-wide connectivity between individual networks at various levels of scope. End-users who only access the Internet when needed to perform a function or obtain information represent the bottom of the routing hierarchy. At the top of the routing hierarchy are the tier 1 networks, large telecommunication companies that exchange traffic directly with each other via peering agreements. Tier 2 and lower level networks buy Internet transit from other providers to reach at least some parties on the global Internet, though they may also engage in peering. An ISP may use a single upstream provider for connectivity, or implement multihoming to achieve redundancy and load balancing. Internet exchange points are major traffic exchanges with physical connections to multiple ISPs.

Large organizations, such as academic institutions, large enterprises, and governments, may perform the same function as ISPs, engaging in peering and purchasing transit on behalf of their internal networks. Research networks tend to interconnect with large subnetworks such as GEANT, GLORIAD, Internet2, and the UK’s national research and education network, JANET.

It has been determined that both the Internet IP routing structure and hypertext links of the World Wide Web are examples of scale-free networks.

Computers and routers use routing tables in their operating system to direct IP packets to the next-hop router or destination. Routing tables are maintained by manual configuration or automatically by routing protocols. End-nodes typically use a default route that points toward an ISP providing transit, while ISP routers use the Border Gateway Protocol to establish the most efficient routing across the complex connections of the global Internet.
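As a rough illustration of that lookup, the sketch below (assuming Python and its standard ipaddress module) performs a longest-prefix match over a few invented routes and falls back to a default route when nothing more specific matches.

    import ipaddress

    # An invented routing table: (destination prefix, next hop).
    routes = [
        (ipaddress.ip_network("0.0.0.0/0"),   "upstream ISP (default route)"),
        (ipaddress.ip_network("10.0.0.0/8"),  "internal core router"),
        (ipaddress.ip_network("10.1.2.0/24"), "departmental LAN gateway"),
    ]

    def next_hop(destination):
        addr = ipaddress.ip_address(destination)
        # Among the matching routes, pick the most specific (longest) prefix.
        matches = [(net, hop) for net, hop in routes if addr in net]
        return max(matches, key=lambda m: m[0].prefixlen)[1]

    print(next_hop("10.1.2.7"))      # departmental LAN gateway
    print(next_hop("10.9.9.9"))      # internal core router
    print(next_hop("196.21.45.1"))   # upstream ISP (default route)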

Access

Common methods of Internet access by users include dial-up with a computer modem via telephone circuits, broadband over coaxial cable, fiber optic or copper wires, Wi-Fi, satellite and cellular telephone technology (3G, 4G). The Internet may often be accessed from computers in libraries and Internet cafes. Internet access points exist in many public places such as airport halls and coffee shops. Various terms are used, such as public Internet kiosk, public access terminal, and Web payphone. Many hotels also have public terminals, though these are usually fee-based. These terminals are widely used for various purposes, such as ticket booking, bank deposits, or online payments. Wi-Fi provides wireless access to the Internet via local computer networks. Hotspots providing such access include Wi-Fi cafes, where users need to bring their own wireless-enabled devices such as a laptop or PDA. These services may be free to all, free to customers only, or fee-based.

Grassroots efforts have led to wireless community networks. Commercial Wi-Fi services covering large city areas are in place in London, Vienna, Toronto, San Francisco, Philadelphia, Chicago and Pittsburgh. The Internet can then be accessed from such places as a park bench. Apart from Wi-Fi, there have been experiments with proprietary mobile wireless networks like Ricochet, various high-speed data services over cellular phone networks, and fixed wireless services. High-end mobile phones such as smartphones in general come with Internet access through the phone network. Web browsers such as Opera are available on these advanced handsets, which can also run a wide variety of other Internet software. More mobile phones have Internet access than PCs, though this is not as widely used. An Internet access provider and protocol matrix differentiates the methods used to get online.

Protocols

While the hardware components in the Internet infrastructure can often be used to support other software systems, it is the design and the standardization process of the software that characterizes the Internet and provides the foundation for its scalability and success. The responsibility for the architectural design of the Internet software systems has been assumed by the Internet Engineering Task Force (IETF). The IETF conducts standard-setting work groups, open to any individual, about the various aspects of Internet architecture. Resulting contributions and standards are published as Request for Comments (RFC) documents on the IETF web site.

The principal methods of networking that enable the Internet are contained in specially designated RFCs that constitute the Internet Standards. Other less rigorous documents are simply informative, experimental, or historical, or document the best current practices (BCP) when implementing Internet technologies.

The Internet standards describe a framework known as the Internet protocol suite. This is a model architecture that divides methods into a layered system of protocols, originally documented in RFC 1122 and RFC 1123. The layers correspond to the environment or scope in which their services operate. At the top is the application layer, the space for the application-specific networking methods used in software applications. For example, a web browser program uses the client-server application model and a specific protocol of interaction between servers and clients, while many file-sharing systems use a peer-to-peer paradigm. Below this top layer, the transport layer connects applications on different hosts with a logical channel through the network with appropriate data exchange methods.

Underlying these layers are the networking technologies that interconnect networks at their borders and hosts via the physical connections. The internet layer enables computers to identify and locate each other via Internet Protocol (IP) addresses, and routes their traffic via intermediate (transit) networks. Last, at the bottom of the architecture is the link layer, which provides connectivity between hosts on the same network link, such as a physical connection in the form of a local area network (LAN) or a dial-up connection. The model, also known as TCP/IP, is designed to be independent of the underlying hardware, which the model therefore does not concern itself with in any detail. Other models have been developed, such as the OSI model, that attempt to be comprehensive in every aspect of communications. While many similarities exist between the models, they are not compatible in the details of description or implementation; indeed, TCP/IP protocols are usually included in the discussion of OSI networking.

 

As user data is processed through the protocol stack, each abstraction layer adds encapsulation information at the sending host. Data is transmitted over the wire at the link level between hosts and routers. Encapsulation is removed by the receiving host. Intermediate relays update link encapsulation at each hop, and inspect the IP layer for routing purposes.
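A very rough sketch of that encapsulation, with invented, simplified headers purely for illustration (real protocols use binary formats), assuming Python:

    # Each layer wraps the data handed down from the layer above.
    def send(application_data):
        segment = "TCP[port=80]" + application_data    # transport layer
        packet = "IP[dst=192.0.2.1]" + segment         # internet layer
        frame = "ETH[mac=aa:bb]" + packet              # link layer
        return frame                                   # what travels on the wire

    # The receiving host strips the headers off in reverse order.
    def receive(frame):
        packet = frame.removeprefix("ETH[mac=aa:bb]")
        segment = packet.removeprefix("IP[dst=192.0.2.1]")
        return segment.removeprefix("TCP[port=80]")

    wire = send("GET / HTTP/1.1")
    print(wire)
    print(receive(wire))    # GET / HTTP/1.1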

The most prominent component of the Internet model is the Internet Protocol (IP), which provides addressing systems (IP addresses) for computers on the Internet. IP enables internetworking and in essence establishes the Internet itself. Internet Protocol Version 4 (IPv4) is the initial version used on the first generation of the Internet and is still in dominant use. It was designed to address up to ~4.3 billion (4.3 × 10^9) Internet hosts. However, the explosive growth of the Internet has led to IPv4 address exhaustion, which entered its final stage in 2011, when the global address allocation pool was exhausted. A new protocol version, IPv6, was developed in the mid-1990s, which provides vastly larger addressing capabilities and more efficient routing of Internet traffic. IPv6 is currently in growing deployment around the world, since Internet address registries (RIRs) began to urge all resource managers to plan rapid adoption and conversion.
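A small sketch, assuming Python and its standard ipaddress module, that contrasts the two address formats and the sizes of their address spaces; the addresses shown are reserved documentation examples.

    import ipaddress

    v4 = ipaddress.ip_address("192.0.2.1")       # a 32-bit IPv4 address
    v6 = ipaddress.ip_address("2001:db8::1")     # a 128-bit IPv6 address

    print(v4.version, v6.version)    # 4 6
    print(2 ** 32)                   # roughly 4.3 billion possible IPv4 addresses
    print(2 ** 128)                  # the vastly larger IPv6 address space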

IPv6 is not directly interoperable by design with IPv4. In essence, it establishes a parallel version of the Internet not directly accessible with IPv4 software. This means software upgrades or translator facilities are necessary for networking devices that need to communicate on both networks. Essentially all modern computer operating systems support both versions of the Internet Protocol. Network infrastructure, however, is still lagging in this development. Aside from the complex array of physical connections that make up its infrastructure, the Internet is facilitated by bi- or multi-lateral commercial contracts, e.g., peering agreements, and by technical specifications or protocols that describe how to exchange data over the network. Indeed, the Internet is defined by its interconnections and routing policies.

Services

The Internet carries many network services, most prominently the World Wide Web, electronic mail, Internet telephony, and File sharing services.

World Wide Web

Photo of Tim Berners-Lee's computer.

 

This NeXT Computer was used by Tim Berners-Lee at CERN and became the world’s first Web server.

Many people use the terms Internet and World Wide Web, or just the Web, interchangeably, but the two terms are not synonymous. The World Wide Web is only one of hundreds of services used on the Internet. The Web is a global set of documents, images and other resources, logically interrelated by hyperlinks and referenced with Uniform Resource Identifiers (URIs). URIs symbolically identify services, servers, and other databases, and the documents and resources that they can provide. Hypertext Transfer Protocol (HTTP) is the main access protocol of the World Wide Web. Web services also use HTTP to allow software systems to communicate in order to share and exchange business logic and data.

World Wide Web browser software, such as Microsoft’s Internet Explorer, Mozilla Firefox, Opera, Apple’s Safari, and Google Chrome, lets users navigate from one web page to another via hyperlinks embedded in the documents. These documents may also contain any combination of computer data, including graphics, sounds, text, video, multimedia and interactive content that runs while the user is interacting with the page. Client-side software can include animations, games, office applications and scientific demonstrations. Through keyword-driven Internet research using search engines like Yahoo! and Google, users worldwide have easy, instant access to a vast and diverse amount of online information. Compared to printed media, books, encyclopedias and traditional libraries, the World Wide Web has enabled the decentralization of information on a large scale.

The Web has also enabled individuals and organizations to publish ideas and information to a potentially large audience online at greatly reduced expense and time delay. Publishing a web page, a blog, or building a website involves little initial cost and many cost-free services are available. However, publishing and maintaining large, professional web sites with attractive, diverse and up-to-date information is still a difficult and expensive proposition. Many individuals and some companies and groups use web logs or blogs, which are largely used as easily updatable online diaries. Some commercial organizations encourage staff to communicate advice in their areas of specialization in the hope that visitors will be impressed by the expert knowledge and free information, and be attracted to the corporation as a result.

One example of this practice is Microsoft, whose product developers publish their personal blogs in order to pique the public’s interest in their work. Collections of personal web pages published by large service providers remain popular, and have become increasingly sophisticated. Whereas operations such as Angelfire and GeoCities have existed since the early days of the Web, newer offerings from, for example, Facebook and Twitter currently have large followings. These operations often brand themselves as social network services rather than simply as web page hosts.

Advertising on popular web pages can be lucrative, and e-commerce or the sale of products and services directly via the Web continues to grow.

When the Web developed in the 1990s, a typical web page was stored in completed form on a web server, formatted in HTML, complete for transmission to a web browser in response to a request. Over time, the process of creating and serving web pages has become dynamic, creating flexible design, layout, and content. Websites are often created using content management software with, initially, very little content. Contributors to these systems, who may be paid staff, members of an organization or the public, fill underlying databases with content using editing pages designed for that purpose, while casual visitors view and read this content in HTML form. There may or may not be editorial, approval and security systems built into the process of taking newly entered content and making it available to the target visitors.

Communication

Email is an important communications service available on the Internet. The concept of sending electronic text messages between parties in a way analogous to mailing letters or memos predates the creation of the Internet. Pictures, documents and other files are sent as email attachments. Emails can be cc-ed to multiple email addresses.
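A brief sketch of composing and handing over such a message with Python’s standard email and smtplib modules; the addresses and the mail server name are placeholders, and the final lines assume an SMTP server is actually reachable under that name.

    import smtplib
    from email.message import EmailMessage

    msg = EmailMessage()
    msg["From"] = "jvanbelle@uct.ac.za"       # placeholder addresses
    msg["To"] = "JaneDoe@stats.uct.ac.za"
    msg["Cc"] = "info@anc.org.za"             # a carbon-copied recipient
    msg["Subject"] = "Meeting agenda"
    msg.set_content("Please find the agenda below.")

    # Hand the message to a mail server for delivery (placeholder host name).
    with smtplib.SMTP("mail.example.com") as server:
        server.send_message(msg)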

Internet telephony is another common communications service made possible by the creation of the Internet. VoIP stands for Voice-over-Internet Protocol, referring to the protocol that underlies all Internet communication. The idea began in the early 1990s with walkie-talkie-like voice applications for personal computers. In recent years many VoIP systems have become as easy to use and as convenient as a normal telephone. The benefit is that, as the Internet carries the voice traffic, VoIP can be free or cost much less than a traditional telephone call, especially over long distances and especially for those with always-on Internet connections such as cable or ADSL. VoIP is maturing into a competitive alternative to traditional telephone service. Interoperability between different providers has improved and the ability to call or receive a call from a traditional telephone is available. Simple, inexpensive VoIP network adapters are available that eliminate the need for a personal computer.

Voice quality can still vary from call to call, but is often equal to and can even exceed that of traditional calls. Remaining problems for VoIP include emergency telephone number dialing and reliability. Currently, a few VoIP providers provide an emergency service, but it is not universally available. Older traditional phones with no “extra features” may be line-powered only and operate during a power failure; VoIP can never do so without a backup power source for the phone equipment and the Internet access devices. VoIP has also become increasingly popular for gaming applications, as a form of communication between players. Popular VoIP clients for gaming include Ventrilo and Teamspeak. Modern video game consoles also offer VoIP chat features.

Data transfer

File sharing is an example of transferring large amounts of data across the Internet. A computer file can be emailed to customers, colleagues and friends as an attachment. It can be uploaded to a website or FTP server for easy download by others. It can be put into a “shared location” or onto a file server for instant use by colleagues. The load of bulk downloads to many users can be eased by the use of “mirror” servers or peer-to-peer networks. In any of these cases, access to the file may be controlled by user authentication, the transit of the file over the Internet may be obscured by encryption, and money may change hands for access to the file. The price can be paid by the remote charging of funds from, for example, a credit card whose details are also passed – usually fully encrypted – across the Internet. The origin and authenticity of the file received may be checked by digital signatures or by MD5 or other message digests. These simple features of the Internet, over a worldwide basis, are changing the production, sale, and distribution of anything that can be reduced to a computer file for transmission. This includes all manner of print publications, software products, news, music, film, video, photography, graphics and the other arts. This in turn has caused seismic shifts in each of the existing industries that previously controlled the production and distribution of these products.
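
As a small illustration of the last point, the sketch below, written in TypeScript for Node.js, computes the MD5 digest of a downloaded file and compares it with the digest the distributor is assumed to have published; the file name and the published value are made-up placeholders.

import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";

// Digest published by the distributor alongside the download (placeholder value).
const publishedDigest = "9e107d9d372bb6826bd81d3542a419d6";

// Read the downloaded file (hypothetical name) and compute its MD5 digest.
const fileBytes = readFileSync("downloaded-album.zip");
const actualDigest = createHash("md5").update(fileBytes).digest("hex");

if (actualDigest === publishedDigest) {
  console.log("Digest matches: the file arrived intact.");
} else {
  console.log("Digest mismatch: the file was corrupted or altered in transit.");
}

In practice a stronger digest such as SHA-256, or a full digital signature, gives better assurance that the file really comes from the claimed source.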

Streaming media is the real-time delivery of digital media for the immediate consumption or enjoyment by end users. Many radio and television broadcasters provide Internet feeds of their live audio and video productions. They may also allow time-shift viewing or listening such as Preview, Classic Clips and Listen Again features. These providers have been joined by a range of pure Internet “broadcasters” who never had on-air licenses. This means that an Internet-connected device, such as a computer or something more specific, can be used to access on-line media in much the same way as was previously possible only with a television or radio receiver. The range of available types of content is much wider, from specialized technical webcasts to on-demand popular multimedia services. Podcasting is a variation on this theme, where – usually audio – material is downloaded and played back on a computer or shifted to a portable media player to be listened to on the move. These techniques using simple equipment allow anybody, with little censorship or licensing control, to broadcast audio-visual material worldwide.

Digital media streaming increases the demand for network bandwidth. For example, standard image quality needs 1 Mbit/s link speed for SD 480p, HD 720p quality requires 2.5 Mbit/s, and the top-of-the-line HDX quality needs 4.5 Mbit/s for 1080p.
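
To put these figures in perspective, the short TypeScript sketch below converts each quoted bit rate into the approximate amount of data that one hour of continuous streaming would transfer, using 8 bits per byte and 1000 MB per GB.

// Link speeds quoted above, in megabits per second.
const qualities: Array<[string, number]> = [
  ["SD 480p", 1.0],
  ["HD 720p", 2.5],
  ["HDX 1080p", 4.5],
];

for (const [label, megabitsPerSecond] of qualities) {
  // megabits/s -> megabytes/s -> megabytes per hour -> gigabytes per hour
  const gigabytesPerHour = (megabitsPerSecond / 8) * 3600 / 1000;
  console.log(`${label}: roughly ${gigabytesPerHour.toFixed(2)} GB per hour`);
}

An hour of HD 720p streaming therefore moves on the order of a gigabyte of data, which is one reason streaming drives demand for bandwidth and for generous data allowances.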

Webcams are a low-cost extension of this phenomenon. While some webcams can give full-frame-rate video, the picture is usually either small or updates slowly. Internet users can watch animals around an African waterhole, ships in the Panama Canal, traffic at a local roundabout or monitor their own premises, live and in real time. Video chat rooms and video conferencing are also popular with many uses being found for personal webcams, with and without two-way sound. YouTube was founded on 15 February 2005 and is now the leading website for free streaming video with a vast number of users. It uses a flash-based web player to stream and show video files. Registered users may upload an unlimited amount of video and build their own personal profile. YouTube claims that its users watch hundreds of millions of videos, and upload hundreds of thousands of videos, daily. Currently, YouTube also uses an HTML5 player.

Social impact

The Internet has enabled new forms of social interaction, activities, and social associations.

Users

Line graph showing that by 2014, nearly 80% of the developed world will use the internet, 40% of the world globally, and 31% in the developing world.

 

Internet users per 100 inhabitants

Internet users by language. English speakers constitute 27% of internet users, Chinese 25%, Others 17%, Spanish 8%, and various others at 5% or less.

 

Internet users by language.

Website content languages. 55% of websites have English content, 11% are in other languages, 6% are in Russian, 5% in German, 5% in Spanish, 4% are in Chinese, 4% in French, 4% in Japanese, 3% in Arabic, and 2% in Portuguese.

 

Website content languages.

Overall Internet usage has seen tremendous growth. From 2000 to 2009, the number of Internet users globally rose from 394 million to 1.858 billion. By 2010, 22 percent of the world’s population had access to computers, with 1 billion Google searches every day, 300 million Internet users reading blogs, and 2 billion videos viewed daily on YouTube. In 2014 the world’s Internet users surpassed 3 billion, or 43.6 percent of the world population, but two-thirds of the users came from the richest countries, with 78.0 percent of the population of Europe using the Internet, followed by 57.4 percent of the Americas.

The prevalent language for communication on the Internet has been English. This may be a result of the origin of the Internet, as well as the language’s role as a lingua franca. Early computer systems were limited to the characters in the American Standard Code for Information Interchange (ASCII), a subset of the Latin alphabet.

After English (27%), the most requested languages on the World Wide Web are Chinese (25%), Spanish (8%), Japanese (5%), Portuguese and German (4% each), Arabic, French and Russian (3% each), and Korean (2%). By region, 42% of the world’s Internet users are based in Asia, 24% in Europe, 14% in North America, 10% in Latin America and the Caribbean taken together, 6% in Africa, 3% in the Middle East and 1% in Australia/Oceania. The Internet’s technologies have developed enough in recent years, especially in the use of Unicode, that good facilities are available for development and communication in the world’s widely used languages. However, some glitches such as mojibake (incorrect display of some languages’ characters) still remain.

In an American study in 2005, the percentage of men using the Internet was very slightly ahead of the percentage of women, although this difference reversed in those under 30. Men logged on more often, spent more time online, and were more likely to be broadband users, whereas women tended to make more use of opportunities to communicate (such as email). Men were more likely to use the Internet to pay bills, participate in auctions, and for recreation such as downloading music and videos. Men and women were equally likely to use the Internet for shopping and banking. More recent studies indicate that in 2008, women significantly outnumbered men on most social networking sites, such as Facebook and Myspace, although the ratios varied with age. In addition, women watched more streaming content, whereas men downloaded more. In terms of blogs, men were more likely to blog in the first place; among those who blog, men were more likely to have a professional blog, whereas women were more likely to have a personal blog.

According to forecasts by Euromonitor International, 44% of the world’s population will be users of the Internet by 2020. Splitting by country, in 2012 Iceland, Norway, Sweden, the Netherlands, and Denmark had the highest Internet penetration by the number of users, with 93% or more of the population with access.

Several neologisms exist that refer to Internet users: Netizen (as in “citizen of the net”) refers to those actively involved in improving online communities, the Internet in general or surrounding political affairs and rights such as free speech; Internaut refers to operators or technically highly capable users of the Internet; and digital citizen refers to a person who uses the Internet in order to engage in society, politics, and government.

Usage

The Internet allows greater flexibility in working hours and location, especially with the spread of unmetered high-speed connections. The Internet can be accessed almost anywhere by numerous means, including through mobile Internet devices. Mobile phones, data cards, handheld game consoles and cellular routers allow users to connect to the Internet wirelessly. Within the limitations imposed by small screens and other limited facilities of such pocket-sized devices, the services of the Internet, including email and the web, may be available. Service providers may restrict the services offered and mobile data charges may be significantly higher than other access methods.

Educational material at all levels from pre-school to post-doctoral is available from websites. Examples range from CBeebies, through school and high-school revision guides and virtual universities, to access to top-end scholarly literature through the likes of Google Scholar. For distance education, help with homework and other assignments, self-guided learning, whiling away spare time, or just looking up more detail on an interesting fact, it has never been easier for people to access educational information at any level from anywhere. The Internet in general and the World Wide Web in particular are important enablers of both formal and informal education. Further, the Internet allows universities, in particular researchers from the social and behavioral sciences, to conduct research remotely via virtual laboratories, with profound changes in reach and generalizability of findings as well as in communication between scientists and in the publication of results.

The low cost and nearly instantaneous sharing of ideas, knowledge, and skills has made collaborative work dramatically easier, with the help of collaborative software. Not only can a group cheaply communicate and share ideas but the wide reach of the Internet allows such groups more easily to form. An example of this is the free software movement, which has produced, among other things, Linux, Mozilla Firefox, and OpenOffice.org. Internet chat, whether using an IRC chat room, an instant messaging system, or a social networking website, allows colleagues to stay in touch in a very convenient way while working at their computers during the day. Messages can be exchanged even more quickly and conveniently than via email. These systems may allow files to be exchanged, drawings and images to be shared, or voice and video contact between team members.

Content management systems allow collaborating teams to work on shared sets of documents simultaneously without accidentally destroying each other’s work. Business and project teams can share calendars as well as documents and other information. Such collaboration occurs in a wide variety of areas including scientific research, software development, conference planning, political activism and creative writing. Social and political collaboration is also becoming more widespread as both Internet access and computer literacy spread.

The Internet allows computer users to remotely access other computers and information stores easily, wherever they may be. They may do this with or without computer security, i.e. authentication and encryption technologies, depending on the requirements. This is encouraging new ways of working from home, collaboration and information sharing in many industries. An accountant sitting at home can audit the books of a company based in another country, on a server situated in a third country that is remotely maintained by IT specialists in a fourth. These accounts could have been created by home-working bookkeepers, in other remote locations, based on information emailed to them from offices all over the world. Some of these things were possible before the widespread use of the Internet, but the cost of private leased lines would have made many of them infeasible in practice. An office worker away from their desk, perhaps on the other side of the world on a business trip or a holiday, can access their emails, access their data using cloud computing, or open a remote desktop session into their office PC using a secure Virtual Private Network (VPN) connection on the Internet. This can give the worker complete access to all of their normal files and data, including email and other applications, while away from the office. It has been referred to among system administrators as the Virtual Private Nightmare, because it extends the secure perimeter of a corporate network into remote locations and its employees’ homes.

Social networking and entertainment

Many people use the World Wide Web to access news, weather and sports reports, to plan and book vacations and to pursue their personal interests. People use chat, messaging and email to make and stay in touch with friends worldwide, sometimes in the same way as some previously had pen pals.

Social networking websites such as Facebook, Twitter, and Myspace have created new ways to socialize and interact. Users of these sites are able to add a wide variety of information to pages, to pursue common interests, and to connect with others. It is also possible to find existing acquaintances, to allow communication among existing groups of people. Sites like LinkedIn foster commercial and business connections. YouTube and Flickr specialize in users’ videos and photographs.

While social networking sites were initially for individuals only, today they are widely used by businesses and other organizations to promote their brands, to market to their customers and to encourage posts to “go viral”. “Black hat” social media techniques are also employed by some organizations, such as spam accounts and astroturfing.

A risk for both individuals and organizations writing posts (especially public posts) on social networking websites is that especially foolish or controversial posts occasionally lead to an unexpected and possibly large-scale backlash on social media from other Internet users. This is also a risk in relation to controversial offline behavior, if it is widely made known. The nature of this backlash can range widely from counter-arguments and public mockery, through insults and hate speech, to, in extreme cases, rape and death threats. The online disinhibition effect describes the tendency of many individuals to behave more stridently or offensively online than they would in person. A significant number of feminist women have been the target of various forms of harassment in response to posts they have made on social media, and Twitter in particular has been criticised in the past for not doing enough to aid victims of online abuse.

For organizations, such a backlash can cause overall brand damage, especially if reported by the media. However, this is not always the case, as any brand damage in the eyes of people with an opposing opinion to that presented by the organization could sometimes be outweighed by strengthening the brand in the eyes of others. Furthermore, if an organization or individual gives in to demands that others perceive as wrong-headed, that can then provoke a counter-backlash.

Some websites, such as Reddit, have rules forbidding the posting of personal information of individuals (also known as doxxing), due to concerns about such postings leading to mobs of large numbers of Internet users directing harassment at the specific individuals thereby identified. In particular, the Reddit rule forbidding the posting of personal information is widely understood to imply that all identifying photos and names must be censored in Facebook screenshots posted to Reddit. However, the interpretation of this rule in relation to public Twitter posts is less clear, and in any case like-minded people online have many other ways they can use to direct each other’s attention to public social media posts they disagree with.

Children also face dangers online such as cyberbullying and approaches by sexual predators, who sometimes pose as children themselves. Children may also encounter material which they may find upsetting, or material which their parents consider to be not age-appropriate. Due to naivety, they may also post personal information about themselves online, which could put them or their families at risk, unless warned not to do so. Many parents choose to enable internet filtering, and/or supervise their children’s online activities, in an attempt to protect their children from inappropriate material on the internet. The most popular social networking websites, such as Facebook and Twitter, commonly forbid users under the age of 13. However, these policies are typically trivial to circumvent by registering an account with a false birth date, and a significant number of children aged under 13 join such sites anyway. Social networking sites for younger children, which claim to provide better levels of protection for children, also exist.

The Internet has been a major outlet for leisure activity since its inception, with entertaining social experiments such as MUDs and MOOs being conducted on university servers, and humor-related Usenet groups receiving much traffic. Today, many Internet forums have sections devoted to games and funny videos. Over 6 million people use blogs or message boards as a means of communication and for the sharing of ideas. The Internet pornography and online gambling industries have taken advantage of the World Wide Web, and often provide a significant source of advertising revenue for other websites. Although many governments have attempted to restrict both industries’ use of the Internet, in general this has failed to stop their widespread popularity.

Screen shot of a game showing trees with cartoonish eyeballs on the branches and cows up in the trees. The photo title is "Behold the Beef Bush" and the scene was created for a Gamespy contest.

 

Image created for a GameSpy contest.

Another area of leisure activity on the Internet is multiplayer gaming. This form of recreation creates communities, where people of all ages and origins enjoy the fast-paced world of multiplayer games. These range from MMORPG to first-person shooters, from role-playing video games to online gambling. While online gaming has been around since the 1970s, modern modes of online gaming began with subscription services such as GameSpy and MPlayer. Non-subscribers were limited to certain types of game play or certain games. Many people use the Internet to access and download music, movies and other works for their enjoyment and relaxation. Free and fee-based services exist for all of these activities, using centralized servers and distributed peer-to-peer technologies. Some of these sources exercise more care with respect to the original artists’ copyrights than others.

Internet usage has been correlated to users’ loneliness. Lonely people tend to use the Internet as an outlet for their feelings and to share their stories with others, such as in the “I am lonely will anyone speak to me” thread.

Cybersectarianism is a new organizational form which involves: “highly dispersed small groups of practitioners that may remain largely anonymous within the larger social context and operate in relative secrecy, while still linked remotely to a larger network of believers who share a set of practices and texts, and often a common devotion to a particular leader. Overseas supporters provide funding and support; domestic practitioners distribute tracts, participate in acts of resistance, and share information on the internal situation with outsiders. Collectively, members and practitioners of such sects construct viable virtual communities of faith, exchanging personal testimonies and engaging in collective study via email, on-line chat rooms and web-based message boards.” In particular, the British government has raised concerns about the prospect of young British Muslims being indoctrinated into Islamic extremism by material on the Internet, being persuaded to join terrorist groups such as the so-called “Islamic State”, and then potentially committing acts of terrorism on returning to Britain after fighting in Syria or Iraq.

Cyberslacking can become a drain on corporate resources; the average UK employee spent 57 minutes a day surfing the Web while at work, according to a 2003 study by Peninsula Business Services. Internet addiction disorder is excessive computer use that interferes with daily life. Author Nicholas Carr believes that Internet use has other effects on individuals, for instance improving skills of scan-reading and interfering with the deep thinking that leads to true creativity.

Electronic business

Electronic business (e-business) encompasses business processes spanning the entire value chain: purchasing, supply chain management, marketing, sales, customer service, and business relationships. E-commerce seeks to add revenue streams by using the Internet to build and enhance relationships with clients and partners.

According to International Data Corporation, the size of worldwide e-commerce, when global business-to-business and business-to-consumer transactions are combined, equated to $16 trillion for 2013. A report by Oxford Economics added those two together to estimate the total size of the digital economy at $20.4 trillion, equivalent to roughly 13.8% of global sales.

Drawbacks

While much has been written of the economic advantages of Internet-enabled commerce, there is also evidence that some aspects of the Internet such as maps and location-aware services may serve to reinforce economic inequality and the digital divide. Electronic commerce may be responsible for consolidation and the decline of mom-and-pop, brick and mortar businesses resulting in increases in income inequality.

Author Andrew Keen, a long-time critic of the social transformations caused by the Internet, has recently focused on the economic effects of consolidation from Internet businesses. Keen cites a 2013 Institute for Local Self-Reliance report saying brick-and-mortar retailers employ 47 people for every $10 million in sales, while Amazon employs only 14. Similarly, the 700-employee room rental start-up Airbnb was valued at $10 billion in 2014, about half as much as Hilton Hotels, which employs 152,000 people. And car-sharing Internet startup Uber employs 1,000 full-time employees and is valued at $18.2 billion, about the same valuation as Avis and Hertz combined, which together employ almost 60,000 people.

Telecommuting

Remote work is facilitated by tools such as groupware, virtual private networks, conference calling, videoconferencing, and Voice over IP (VoIP). It can be efficient and useful for companies as it allows workers to communicate over long distances, saving significant amounts of travel time and cost. As broadband Internet connections become more commonplace, more and more workers have adequate bandwidth at home to use these tools to link their home to their corporate intranet and internal phone networks.

Crowdsourcing

The Internet provides a particularly good venue for crowdsourcing (outsourcing tasks to a distributed group of people) because individuals tend to be more open in web-based projects, where they are not being physically judged or scrutinized, and thus can feel more comfortable sharing.

Crowdsourcing systems are used to accomplish a variety of tasks. For example, the crowd may be invited to develop a new technology, carry out a design task, refine or carry out the steps of an algorithm (see human-based computation), or help capture, systematize, or analyze large amounts of data (see also citizen science).

Wikis have also been used in the academic community for sharing and dissemination of information across institutional and international boundaries. In those settings, they have been found useful for collaboration on grant writing, strategic planning, departmental documentation, and committee work. The United States Patent and Trademark Office uses a wiki to allow the public to collaborate on finding prior art relevant to examination of pending patent applications. Queens, New York has used a wiki to allow citizens to collaborate on the design and planning of a local park.

The English Wikipedia has the largest user base among wikis on the World Wide Web and ranks in the top 10 among all Web sites in terms of traffic.

Politics and political revolutions

Woman walking past the banner prohibiting certain social media activities.

 

Banner in Bangkok during the 2014 Thai coup d’état, informing the Thai public that ‘like’ or ‘share’ activities on social media could result in imprisonment (observed June 30, 2014).

The Internet has achieved new relevance as a political tool. The presidential campaign of Howard Dean in 2004 in the United States was notable for its success in soliciting donations via the Internet. Many political groups use the Internet to achieve a new method of organizing and carrying out their missions, giving rise to Internet activism, most notably practiced by rebels in the Arab Spring.

The New York Times suggested that social media websites, such as Facebook and Twitter, helped people organize the political revolutions in Egypt, by helping activists organize protests, communicate grievances, and disseminate information.

The potential of the Internet as a civic tool of communicative power was explored by Simon R. B. Berdal in his 2004 thesis:

As the globally evolving Internet provides ever new access points to virtual discourse forums, it also promotes new civic relations and associations within which communicative power may flow and accumulate. Thus, traditionally … national-embedded peripheries get entangled into greater, international peripheries, with stronger combined powers… The Internet, as a consequence, changes the topology of the “centre-periphery” model, by stimulating conventional peripheries to interlink into “super-periphery” structures, which enclose and “besiege” several centers at once.

Berdal, therefore, extends the Habermasian notion of the Public sphere to the Internet, and underlines the inherent global and civic nature that interwoven Internet technologies provide. To limit the growing civic potential of the Internet, Berdal also notes how “self-protective measures” are put in place by those threatened by it:

If we consider China’s attempts to filter “unsuitable material” from the Internet, most of us would agree that this resembles a self-protective measure by the system against the growing civic potentials of the Internet. Nevertheless, both types represent limitations to “peripheral capacities”. Thus, the Chinese government tries to prevent communicative power to build up and unleash (as the 1989 Tiananmen Square uprising suggests, the government may find it wise to install “upstream measures”). Even though limited, the Internet is proving to be an empowering tool also to the Chinese periphery: Analysts believe that Internet petitions have influenced policy implementation in favor of the public’s online-articulated will …

Incidents of politically motivated Internet censorship have now been recorded in many countries, including western democracies.

Philanthropy

The spread of low-cost Internet access in developing countries has opened up new possibilities for peer-to-peer charities, which allow individuals to contribute small amounts to charitable projects for other individuals. Websites, such as DonorsChoose and GlobalGiving, allow small-scale donors to direct funds to individual projects of their choice.

A popular twist on Internet-based philanthropy is the use of peer-to-peer lending for charitable purposes. Kiva pioneered this concept in 2005, offering the first web-based service to publish individual loan profiles for funding. Kiva raises funds for local intermediary microfinance organizations which post stories and updates on behalf of the borrowers. Lenders can contribute as little as $25 to loans of their choice, and receive their money back as borrowers repay. Kiva falls short of being a pure peer-to-peer charity, in that loans are disbursed before being funded by lenders and borrowers do not communicate with lenders themselves.

However, the recent spread of low cost Internet access in developing countries has made genuine international person-to-person philanthropy increasingly feasible. In 2009 the US-based nonprofit Zidisha tapped into this trend to offer the first person-to-person microfinance platform to link lenders and borrowers across international borders without intermediaries. Members can fund loans for as little as a dollar, which the borrowers then use to develop business activities that improve their families’ incomes while repaying loans to the members with interest. Borrowers access the Internet via public cybercafes, donated laptops in village schools, and even smart phones, then create their own profile pages through which they share photos and information about themselves and their businesses. As they repay their loans, borrowers continue to share updates and dialogue with lenders via their profile pages. This direct web-based connection allows members themselves to take on many of the communication and recording tasks traditionally performed by local organizations, bypassing geographic barriers and dramatically reducing the cost of microfinance services to the entrepreneurs.

Security

Many computer scientists describe the Internet as a “prime example of a large-scale, highly engineered, yet highly complex system”. The structure was found to be highly robust to random failures, yet, very vulnerable to intentional attacks.

The Internet structure and its usage characteristics have been studied extensively and the possibility of developing alternative structures has been investigated.

Internet resources, hardware and software components, are the target of malicious attempts to gain unauthorized control to cause interruptions, or access private information. Such attempts include computer viruses which copy with the help of humans, computer worms which copy themselves automatically, denial of service attacks, ransomware, botnets, and spyware that reports on the activity and typing of users. Usually these activities constitute cybercrime. Defense theorists have also speculated about the possibilities of cyber warfare using similar methods on a large scale.

Surveillance

The vast majority of computer surveillance involves the monitoring of data and traffic on the Internet. In the United States for example, under the Communications Assistance For Law Enforcement Act, all phone calls and broadband Internet traffic (emails, web traffic, instant messaging, etc.) are required to be available for unimpeded real-time monitoring by Federal law enforcement agencies.

Packet capture (also sometimes referred to as “packet sniffing”) is the monitoring of data traffic on a computer network. Computers communicate over the Internet by breaking up messages (emails, images, videos, web pages, files, etc.) into small chunks called “packets”, which are routed through a network of computers, until they reach their destination, where they are assembled back into a complete “message” again. A packet capture appliance intercepts these packets as they are traveling through the network, in order to examine their contents using other programs. A packet capture is an information gathering tool, but not an analysis tool. That is, it gathers “messages” but does not analyze them or figure out what they mean. Other programs are needed to perform traffic analysis and sift through intercepted data looking for important or useful information. Under the Communications Assistance For Law Enforcement Act all U.S. telecommunications providers are required to install packet sniffing technology to allow Federal law enforcement and intelligence agencies to intercept all of their customers’ broadband Internet and voice over Internet protocol (VoIP) traffic.

There is far too much data gathered by these packet sniffers for human investigators to manually search through all of it. So automated Internet surveillance computers sift through the vast amount of intercepted Internet traffic, and filter out and report to human investigators those bits of information which are “interesting”—such as the use of certain words or phrases, visiting certain types of web sites, or communicating via email or chat with a certain individual or group. Billions of dollars per year are spent, by agencies such as the Information Awareness Office, NSA, GCHQ and the FBI, to develop, purchase, implement, and operate systems which intercept and analyze all of this data, and extract only the information which is useful to law enforcement and intelligence agencies.

Similar systems are now operated by the Iranian secret police to identify and suppress dissidents. All of the required hardware and software has allegedly been installed by the German company Siemens AG and the Finnish company Nokia.

Censorship

 

Internet censorship and surveillance by country

 

Some governments, such as those of Burma, Iran, North Korea, mainland China, Saudi Arabia and the United Arab Emirates, restrict access to content on the Internet within their territories, especially to political and religious content, with domain name and keyword filters.

In Norway, Denmark, Finland, and Sweden, major Internet service providers have voluntarily agreed to restrict access to sites listed by authorities. While this list of forbidden resources is supposed to contain only known child pornography sites, the content of the list is secret. Many countries, including the United States, have enacted laws against the possession or distribution of certain material, such as child pornography, via the Internet, but do not mandate filter software. Many free or commercially available software programs, called content-control software, are available to users to block offensive websites on individual computers or networks, in order to limit access by children to pornographic material or depictions of violence.

Performance

As the Internet is a heterogeneous network, the physical characteristics, including for example the data transfer rates of connections, vary widely. It exhibits emergent phenomena that depend on its large-scale organization.

Outages

An Internet blackout or outage can be caused by local signaling interruptions. Disruptions of submarine communications cables may cause blackouts or slowdowns to large areas, such as in the 2008 submarine cable disruption. Less-developed countries are more vulnerable due to a small number of high-capacity links. Land cables are also vulnerable, as in 2011 when a woman digging for scrap metal severed most connectivity for the nation of Armenia. Internet blackouts affecting almost entire countries can be achieved by governments as a form of Internet censorship, as in the blockage of the Internet in Egypt, whereby approximately 93% of networks were without access in 2011 in an attempt to stop mobilization for anti-government protests.

Energy use

In 2011 researchers estimated that the Internet used between 170 and 307 GW of power, less than two percent of the energy used by humanity. This estimate included the energy needed to build, operate, and periodically replace the estimated 750 million laptops, a billion smart phones and 100 million servers worldwide, as well as the energy that routers, cell towers, optical switches, Wi-Fi transmitters and cloud storage devices use when transmitting Internet traffic.
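
As a rough sanity check on the “less than two percent” statement, the TypeScript sketch below converts the estimated power draw into energy per year and compares it with an assumed figure of roughly 150,000 TWh of primary energy used by humanity annually; that comparison figure is an approximation introduced here for illustration, not part of the study.

// Estimated power draw of the Internet in 2011, in gigawatts (from the paragraph above).
const estimatesGW = [170, 307];

// Assumed world primary energy use: roughly 150,000 TWh per year (approximate figure).
const worldTWhPerYear = 150_000;

const hoursPerYear = 365 * 24; // 8760 hours
for (const gw of estimatesGW) {
  const twhPerYear = (gw * hoursPerYear) / 1000; // GW * hours = GWh; divide by 1000 for TWh
  const share = (twhPerYear / worldTWhPerYear) * 100;
  console.log(`${gw} GW is about ${twhPerYear.toFixed(0)} TWh per year, about ${share.toFixed(1)}% of world energy use`);
}

Both ends of the range come out below two percent under this assumption, consistent with the researchers’ claim.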

 

Introduction

graphic with the words "World Wide Web"

The World Wide Web (WWW, or W3) is an information space where documents and other web resources are identified by URIs, interlinked by hypertext links, and can be accessed via the Internet. It has become known simply as the Web. Hypertext documents are commonly called web pages, which are primarily text documents formatted and annotated with the Hypertext Markup Language (HTML). Web pages may contain links to images, video, and software components that are rendered to users of a web browser application, running on the user’s computer, as coherent pages of multimedia content. Embedded hyperlinks permit users to navigate between web pages. When multiple web pages are published with a common theme or within a common domain name, the collection is usually called a web site.

British computer scientist Tim Berners-Lee is the inventor of the Web. As a CERN employee, Berners-Lee distributed a proposal on 12 March 1989 for what would eventually become the World Wide Web. The initial proposal intended a more effective CERN communication system, but Berners-Lee also realized the concept could be implemented throughout the world. Berners-Lee and Belgian computer scientist Robert Cailliau proposed in 1990 to use hypertext “to link and access information of various kinds as a web of nodes in which the user can browse at will”, and Berners-Lee finished the first website in December of that year. The first test was completed around 20 December 1990 and Berners-Lee reported about the project on the newsgroup alt.hypertext on 7 August 1991.

History

Photo of the computer.

 

The NeXT Computer used by Tim Berners-Lee at CERN.

On March 12, 1989, Tim Berners-Lee issued a proposal to the management at CERN that referenced ENQUIRE, a database and software project he had built in 1980, and described a more elaborate information management system based on links embedded in readable text: “Imagine, then, the references in this document all being associated with the network address of the thing to which they referred, so that while reading this document you could skip to them with a click of the mouse.” Such a system, he explained, could be referred to using one of the existing meanings of the word hypertext, a term that he says was coined in the 1950s. There is no reason, the proposal continues, why such hypertext links could not encompass multimedia documents including graphics, speech and video, so that Berners-Lee goes on to propose the term hypermedia.

With help from Robert Cailliau, he published a more formal proposal (on 12 November 1990) to build a “Hypertext project” called “WorldWideWeb” (one word, also “W3”) as a “web” of “hypertext documents” to be viewed by “browsers” using a client–server architecture. This proposal estimated that a read-only web would be developed within three months and that it would take six months to achieve “the creation of new links and new material by readers, [so that] authorship becomes universal” as well as “the automatic notification of a reader when new material of interest to him/her has become available.” While the read-only goal was met, accessible authorship of web content took longer to mature, with the wiki concept, WebDAV, blogs, Web 2.0 and RSS/Atom.

The proposal was modeled after the SGML reader Dynatext by Electronic Book Technology, a spin-off from the Institute for Research in Information and Scholarship at Brown University. The Dynatext system, licensed by CERN, was a key player in the extension of SGML ISO 8879:1986 to Hypermedia within HyTime, but it was considered too expensive and had an inappropriate licensing policy for use in the general high energy physics community, namely a fee for each document and each document alteration.

Photo of the CERN data center.

 

The CERN data center in 2010 housing some WWW servers

A NeXT Computer was used by Berners-Lee as the world’s first web server and also to write the first web browser, WorldWideWeb, in 1990. By Christmas 1990, Berners-Lee had built all the tools necessary for a working Web: the first web browser (which was a web editor as well); the first web server; and the first web pages, which described the project itself.

The first web page may be lost, but Paul Jones of UNC-Chapel Hill in North Carolina announced in May 2013 that Berners-Lee gave him what he says is the oldest known web page during a 1991 visit to UNC. Jones stored it on a magneto-optical drive and on his NeXT computer.

On 6 August 1991, Berners-Lee published a short summary of the World Wide Web project on the newsgroup alt.hypertext. This date also marked the debut of the Web as a publicly available service on the Internet, although new users only accessed it after 23 August. For this reason this is considered the internaut’s day. Several news media have reported that the first photo on the Web was published by Berners-Lee in 1992, an image of the CERN house band Les Horribles Cernettes taken by Silvano de Gennaro; Gennaro has disclaimed this story, writing that media were “totally distorting our words for the sake of cheap sensationalism.”

The first server outside Europe was installed at the Stanford Linear Accelerator Center (SLAC) in Palo Alto, California, to host the SPIRES-HEP database. Accounts differ substantially as to the date of this event. The World Wide Web Consortium says December 1992, whereas SLAC itself claims 1991. This is supported by a W3C document titled A Little History of the World Wide Web.

The underlying concept of hypertext originated in previous projects from the 1960s, such as the Hypertext Editing System (HES) at Brown University, Ted Nelson’s Project Xanadu, and Douglas Engelbart’s oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar Bush’s microfilm-based memex, which was described in the 1945 essay “As We May Think”.

Berners-Lee’s breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he explains that he had repeatedly suggested that a marriage between the two technologies was possible to members of both technical communities, but when no one took up his invitation, he finally assumed the project himself. In the process, he developed three essential technologies:

  • a system of globally unique identifiers for resources on the Web and elsewhere, the universal document identifier (UDI), later known as uniform resource locator (URL) and uniform resource identifier (URI);
  • the publishing language HyperText Markup Language (HTML);
  • the Hypertext Transfer Protocol (HTTP).

The World Wide Web had a number of differences from other hypertext systems available at the time. The Web required only unidirectional links rather than bidirectional ones, making it possible for someone to link to another resource without action by the owner of that resource. It also significantly reduced the difficulty of implementing web servers and browsers (in comparison to earlier systems), but in turn presented the chronic problem of link rot. Unlike predecessors such as HyperCard, the World Wide Web was non-proprietary, making it possible to develop servers and clients independently and to add extensions without licensing restrictions. On 30 April 1993, CERN announced that the World Wide Web would be free to anyone, with no fees due. Coming two months after the announcement that the server implementation of the Gopher protocol was no longer free to use, this produced a rapid shift away from Gopher and towards the Web. An early popular web browser was ViolaWWW for Unix and the X Windowing System.

 

Robert Cailliau, Jean-François Abramatic of IBM, and Tim Berners-Lee at the 10th anniversary of the World Wide Web Consortium.

Scholars generally agree that a turning point for the World Wide Web began with the introduction of the Mosaic web browser in 1993, a graphical browser developed by a team at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign (NCSA-UIUC), led by Marc Andreessen. Funding for Mosaic came from the U.S. High-Performance Computing and Communications Initiative and the High Performance Computing and Communication Act of 1991, one of several computing developments initiated by U.S. Senator Al Gore. Prior to the release of Mosaic, graphics were not commonly mixed with text in web pages and the web’s popularity was less than older protocols in use over the Internet, such as Gopher and Wide Area Information Servers (WAIS). Mosaic’s graphical user interface allowed the Web to become, by far, the most popular Internet protocol.

The World Wide Web Consortium (W3C) was founded by Tim Berners-Lee after he left the European Organization for Nuclear Research (CERN) in October 1994. It was founded at the Massachusetts Institute of Technology Laboratory for Computer Science (MIT/LCS) with support from the Defense Advanced Research Projects Agency (DARPA), which had pioneered the Internet; a year later, a second site was founded at INRIA (a French national computer research lab) with support from the European Commission DG InfSo; and in 1996, a third continental site was created in Japan at Keio University. By the end of 1994, the total number of websites was still relatively small, but many notable websites were already active that foreshadowed or inspired today’s most popular services.

Connected by the existing Internet, other websites were created around the world, adding international standards for domain names and HTML. Since then, Berners-Lee has played an active role in guiding the development of web standards (such as the markup languages to compose web pages in), and has advocated his vision of a Semantic Web. The World Wide Web enabled the spread of information over the Internet through an easy-to-use and flexible format. It thus played an important role in popularizing use of the Internet. Although the two terms are sometimes conflated in popular use, World Wide Web is not synonymous with Internet. The Web is an information space containing hyperlinked documents and other resources, identified by their URIs. It is implemented as both client and server software using Internet protocols such as TCP/IP and HTTP.

Tim Berners-Lee was knighted in 2004 by Queen Elizabeth II for “services to the global development of the Internet”.

Function

Key layers of the Internet

 

The World Wide Web functions as a layer on top of the Internet, making the Internet’s services easier to use. The advent of the Mosaic web browser helped to make the Web much more usable.

The terms Internet and World Wide Web are often used without much distinction. However, the two things are not the same. The Internet is a global system of interconnected computer networks. In contrast, the World Wide Web is one of the services transferred over these networks. It is a collection of text documents and other resources, linked by hyperlinks and URLs, usually accessed by web browsers, from web servers.

Viewing a web page on the World Wide Web normally begins either by typing the URL of the page into a web browser, or by following a hyperlink to that page or resource. The web browser then initiates a series of background communication messages to fetch and display the requested page. In the 1990s, using a browser to view web pages—and to move from one web page to another through hyperlinks—came to be known as ‘browsing’, ‘web surfing’ (after channel surfing), or ‘navigating the Web’. Early studies of this new behavior investigated user patterns in using web browsers. One study, for example, found five user patterns: exploratory surfing, window surfing, evolved surfing, bounded navigation and targeted navigation.

The following example demonstrates the functioning of a web browser when accessing a page at the URL http://example.org/wiki/World_Wide_Web. The browser resolves the server name of the URL (example.org) into an Internet Protocol address using the globally distributed Domain Name System (DNS). This lookup returns an IP address such as 203.0.113.4. The browser then requests the resource by sending an HTTP request across the Internet to the computer at that address. It requests service from a specific TCP port number that is well known for the HTTP service, so that the receiving host can distinguish an HTTP request from other network protocols it may be servicing. The HTTP protocol normally uses port number 80. The content of the HTTP request can be as simple as two lines of text:

GET /wiki/World_Wide_Web HTTP/1.1
Host: example.org

The computer receiving the HTTP request delivers it to web server software listening for requests on port 80. If the web server can fulfill the request it sends an HTTP response back to the browser indicating success:

HTTP/1.0 200 OK
Content-Type: text/html; charset=UTF-8

followed by the content of the requested page. The Hypertext Markup Language for a basic web page looks like:

<html>
  <head>
    <title>Example.org – The World Wide Web</title>
  </head>
  <body>
    <p>The World Wide Web, abbreviated as WWW and commonly known …</p>
  </body>
</html>

The web browser parses the HTML and interprets the markup (<title>, <p> for paragraph, and so on) that surrounds the words to format the text on the screen. Many web pages use HTML to reference the URLs of other resources such as images, other embedded media, scripts that affect page behavior, and Cascading Style Sheets that affect page layout. The browser makes additional HTTP requests to the web server for these other Internet media types. As it receives their content from the web server, the browser progressively renders the page onto the screen as specified by its HTML and these additional resources.
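
Assuming a Node.js environment is available, the TypeScript sketch below performs the same exchange by hand: it resolves example.org through DNS, opens a TCP connection to port 80, writes the GET request shown above (with an extra Connection: close header so that the server ends the connection when it has finished), and prints whatever comes back.

import { lookup } from "node:dns/promises";
import { connect } from "node:net";

async function fetchByHand(): Promise<void> {
  // Step 1: resolve the server name into an IP address using DNS.
  const { address } = await lookup("example.org");
  console.log(`example.org resolved to ${address}`);

  // Step 2: open a TCP connection to the well-known HTTP port, 80.
  const socket = connect({ host: address, port: 80 }, () => {
    // Step 3: send the request; header lines end with CRLF, and a blank
    // line marks the end of the request headers.
    socket.write(
      "GET /wiki/World_Wide_Web HTTP/1.1\r\n" +
      "Host: example.org\r\n" +
      "Connection: close\r\n" +
      "\r\n"
    );
  });

  // Step 4: print the status line, response headers and body as they arrive.
  socket.on("data", (chunk: Buffer) => process.stdout.write(chunk));
  socket.on("end", () => console.log("\n(connection closed by the server)"));
}

fetchByHand();

Because the path /wiki/World_Wide_Web does not actually exist on example.org, the server will normally answer with an error page, but the shape of the exchange (request line, headers, blank line, then the response) is exactly the one described above.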

Linking

Most web pages contain hyperlinks to other related pages and perhaps to downloadable files, source documents, definitions and other web resources. In the underlying HTML, a hyperlink looks like:

<a href="http://example.org/wiki/Main_Page">Example.org, a free encyclopedia</a>

Graphic representation of a minute fraction of the WWW, demonstrating hyperlinks.

Such a collection of useful, related resources, interconnected via hypertext links is dubbed a web of information. Publication on the Internet created what Tim Berners-Lee first called the WorldWideWeb (in its original CamelCase, which was subsequently discarded) in November 1990.

The hyperlink structure of the WWW is described by the webgraph: the nodes of the webgraph correspond to the web pages (or URLs), and the directed edges between them to the hyperlinks.
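
One minimal way to represent a webgraph in a program is as an adjacency list that maps each page to the pages it links to; the TypeScript sketch below uses pages invented purely for illustration.

// A tiny, invented webgraph: each key is a page (node), each value the
// list of pages it links to (its outgoing, directed edges).
const webgraph = new Map<string, string[]>([
  ["http://example.org/", ["http://example.org/about", "http://example.org/wiki/World_Wide_Web"]],
  ["http://example.org/about", ["http://example.org/"]],
  ["http://example.org/wiki/World_Wide_Web", []], // a page with no outgoing links
]);

// Following a hyperlink corresponds to walking one directed edge of the graph.
for (const [page, links] of webgraph) {
  console.log(`${page} has ${links.length} outgoing link(s)`);
}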

Over time, many web resources pointed to by hyperlinks disappear, relocate, or are replaced with different content. This makes hyperlinks obsolete, a phenomenon referred to in some circles as link rot, and the hyperlinks affected by it are often called dead links. The ephemeral nature of the Web has prompted many efforts to archive web sites. The Internet Archive, active since 1996, is the best known of such efforts.

Dynamic updates of web pages

JavaScript is a scripting language that was initially developed in 1995 by Brendan Eich, then of Netscape, for use within web pages. The standardised version is ECMAScript. To make web pages more interactive, some web applications also use JavaScript techniques such as Ajax (asynchronous JavaScript and XML). Client-side script delivered with the page can make additional HTTP requests to the server, either in response to user actions such as mouse movements or clicks, or based on elapsed time. The server’s responses are used to modify the current page rather than creating a new page with each response, so the server needs only to provide limited, incremental information. Multiple Ajax requests can be handled at the same time, and users can interact with the page while data is retrieved. Web pages may also regularly poll the server to check whether new information is available.
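
As a browser-side sketch of this pattern, written in TypeScript, the code below polls the server every 30 seconds and replaces part of the current page instead of loading a new one; the element id "headlines" and the endpoint /latest-headlines are invented for the example.

// Ask the server (hypothetical endpoint) for a small, incremental update.
async function refreshHeadlines(): Promise<void> {
  const response = await fetch("/latest-headlines");
  const fragment = await response.text();

  // Modify the current page in place; no full page reload takes place.
  const target = document.getElementById("headlines");
  if (target !== null) {
    target.innerHTML = fragment;
  }
}

// Poll for new information every 30 seconds. The user can keep interacting
// with the rest of the page while these background requests are in flight.
setInterval(() => { void refreshHeadlines(); }, 30_000);
void refreshHeadlines();

The same pattern underlies Ajax applications whether the payload is HTML, XML or, more commonly today, JSON.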

WWW prefix

Many hostnames used for the World Wide Web begin with www because of the long-standing practice of naming Internet hosts according to the services they provide. The hostname of a web server is often www, in the same way that it may be ftp for an FTP server, and news or nntp for a USENET news server. These host names appear as Domain Name System (DNS) or subdomain names, as in www.example.com. The use of www is not required by any technical or policy standard and many web sites do not use it; indeed, the first ever web server was called nxoc01.cern.ch. According to Paolo Palazzi, who worked at CERN along with Tim Berners-Lee, the popular use of www as a subdomain was accidental; the World Wide Web project page was intended to be published at www.cern.ch while info.cern.ch was intended to be the CERN home page; however, the DNS records were never switched, and the practice of prepending www to an institution’s website domain name was subsequently copied. Many established websites still use the prefix, or they employ other subdomain names such as www2, secure or en for special purposes. Many such web servers are set up so that both the main domain name (e.g., example.com) and the www subdomain (e.g., www.example.com) refer to the same site; others require one form or the other, or they may map to different web sites.

The use of a subdomain name is useful for load balancing incoming web traffic by creating a CNAME record that points to a cluster of web servers. Since, currently, only a subdomain can be used in a CNAME, the same result cannot be achieved by using the bare domain root.

When a user submits an incomplete domain name to a web browser in its address bar input field, some web browsers automatically try adding the prefix “www” to the beginning of it and possibly “.com”, “.org” and “.net” at the end, depending on what might be missing. For example, entering ‘microsoft’ may be transformed to http://www.microsoft.com/ and ‘openoffice’ to http://www.openoffice.org. This feature started appearing in early versions of Mozilla Firefox, when it still had the working title ‘Firebird’ in early 2003, from an earlier practice in browsers such as Lynx. It is reported that Microsoft was granted a US patent for the same idea in 2008, but only for mobile devices.
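
The TypeScript sketch below is a deliberately simplified, hypothetical imitation of that guessing behaviour; it is not the algorithm of any particular browser.

// Turn whatever was typed into the address bar into one or more URL guesses.
function guessUrls(typed: string): string[] {
  const name = typed.trim().toLowerCase();

  // Input already contains a dot: treat it as a host name and only add a scheme.
  if (name.includes(".")) {
    return [`http://${name}/`];
  }

  // A bare word such as "microsoft": try the www prefix with common endings.
  return [".com", ".org", ".net"].map((ending) => `http://www.${name}${ending}/`);
}

console.log(guessUrls("microsoft"));  // first candidate: http://www.microsoft.com/
console.log(guessUrls("openoffice")); // candidates include http://www.openoffice.org/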

In English, www is usually read as double-u double-u double-u. Some users pronounce it dub-dub-dub, particularly in New Zealand. Stephen Fry, in his “Podgrammes” series of podcasts, pronounces it wuh wuh wuh. The English writer Douglas Adams once quipped in The Independent on Sunday (1999): “The World Wide Web is the only thing I know of whose shortened form takes three times longer to say than what it’s short for”. In Mandarin Chinese, World Wide Web is commonly translated via a phono-semantic matching to wàn wéi wǎng (万维网), which satisfies www and literally means “myriad dimensional net”, a translation that reflects the design concept and proliferation of the World Wide Web. Tim Berners-Lee’s web-space states that World Wide Web is officially spelled as three separate words, each capitalised, with no intervening hyphens.

Use of the www prefix is declining as Web 2.0 web applications seek to brand their domain names and make them easily pronounceable. As the mobile web grows in popularity, services like Gmail.com, MySpace.com, Facebook.com and Twitter.com are most often mentioned without adding “www.” (or, indeed, “.com”) to the domain.

Scheme specifiers

The scheme specifiers http:// and https:// at the start of a web URI refer to Hypertext Transfer Protocol or HTTP Secure, respectively. They specify the communication protocol to use for the request and response. The HTTP protocol is fundamental to the operation of the World Wide Web, and the added encryption layer in HTTPS is essential when browsers send or retrieve confidential data, such as passwords or banking information. Web browsers usually automatically prepend http:// to user-entered URIs, if omitted.

Web security

For criminals, the web has become the preferred way to spread malware. Cybercrime on the web can include identity theft, fraud, espionage and intelligence gathering. Web-based vulnerabilities now outnumber traditional computer security concerns, and as measured by Google, about one in ten web pages may contain malicious code. Most web-based attacks take place on legitimate websites, and most, as measured by Sophos, are hosted in the United States, China and Russia. The most common of all malware threats is the SQL injection attack against websites. Through HTML and URIs, the Web was vulnerable to attacks like cross-site scripting (XSS) that came with the introduction of JavaScript and were exacerbated to some degree by Web 2.0 and Ajax web design that favors the use of scripts. Today, by one estimate, 70% of all websites are open to XSS attacks on their users. Phishing is another common threat to the Web. “RSA, The Security Division of EMC, today announced the findings of its January 2013 Fraud Report, estimating the global losses from phishing at $1.5 Billion in 2012.” Two of the well-known phishing methods are Covert Redirect and Open Redirect.

Proposed solutions vary to extremes. Large security vendors like McAfee already design governance and compliance suites to meet post-9/11 regulations, and some, like Finjan have recommended active real-time inspection of code and all content regardless of its source. Some have argued that for enterprise to see security as a business opportunity rather than a cost center, “ubiquitous, always-on digital rights management” enforced in the infrastructure by a handful of organizations must replace the hundreds of companies that today secure data and networks. Jonathan Zittrain has said users sharing responsibility for computing safety is far preferable to locking down the Internet.

Privacy

Every time a client requests a web page, the server can identify the request’s IP address and usually logs it. Also, unless set not to do so, most web browsers record requested web pages in a viewable history feature, and usually cache much of the content locally. Unless the server-browser communication uses HTTPS encryption, web requests and responses travel in plain text across the internet and can be viewed, recorded, and cached by intermediate systems.

When a web page asks for, and the user supplies, personally identifiable information (such as their real name, address, or e-mail address), web-based entities can associate current web traffic with that individual. If the website uses HTTP cookies, username and password authentication, or other tracking techniques, it can relate other web visits, before and after, to the identifiable information provided. In this way it is possible for a web-based organisation to develop and build a profile of the individual people who use its site or sites. It may be able to build a record for an individual that includes information about their leisure activities, their shopping interests, their profession, and other aspects of their demographic profile. These profiles are obviously of potential interest to marketers, advertisers and others. Depending on the website’s terms and conditions and the local laws that apply, information from these profiles may be sold, shared, or passed to other organisations without the user being informed. For many ordinary people, this means little more than some unexpected e-mails in their in-box, or some uncannily relevant advertising on a future web page. For others, it can mean that time spent indulging an unusual interest can result in a deluge of further targeted marketing that may be unwelcome. Law enforcement, counter-terrorism and espionage agencies can also identify, target and track individuals based on their interests or proclivities on the Web.

Social networking sites try to get users to use their real names, interests, and locations. They believe this makes the social networking experience more realistic, and therefore more engaging for all their users. On the other hand, uploaded photographs or unguarded statements can be identified to an individual, who may regret this exposure. Employers, schools, parents, and other relatives may be influenced by aspects of social networking profiles that the posting individual did not intend for these audiences. On-line bullies may make use of personal information to harass or stalk users. Modern social networking websites allow fine grained control of the privacy settings for each individual posting, but these can be complex and not easy to find or use, especially for beginners.


Photographs and videos posted onto websites have caused particular problems, as they can add a person’s face to an on-line profile. With modern and potential facial recognition technology, it may then be possible to relate that face with other, previously anonymous, images, events and scenarios that have been imaged elsewhere. Because of image caching, mirroring and copying, it is difficult to remove an image from the World Wide Web.

Standards

Many formal standards and other technical specifications and software define the operation of different aspects of the World Wide Web, the Internet, and computer information exchange. Many of the documents are the work of the World Wide Web Consortium (W3C), headed by Berners-Lee, but some are produced by the Internet Engineering Task Force (IETF) and other organizations.

Usually, when web standards are discussed, the following publications are seen as foundational:

  • Recommendations for markup languages, especially HTML and XHTML, from the W3C. These define the structure and interpretation of hypertext documents.
  • Recommendations for stylesheets, especially CSS, from the W3C.
  • Standards for ECMAScript (usually in the form of JavaScript), from Ecma International.
  • Recommendations for the Document Object Model, from W3C.

Additional publications provide definitions of other essential technologies for the World Wide Web, including, but not limited to, the following:

  • Uniform Resource Identifier (URI), which is a universal system for referencing resources on the Internet, such as hypertext documents and images. URIs, often called URLs, are defined by the IETF’s RFC 3986 / STD 66: Uniform Resource Identifier (URI): Generic Syntax, as well as its predecessors and numerous URI scheme-defining RFCs;
  • HyperText Transfer Protocol (HTTP), especially as defined by RFC 2616: HTTP/1.1 and RFC 2617: HTTP Authentication, which specify how the browser and server authenticate each other.

Accessibility

There are methods for accessing the Web in alternative mediums and formats to facilitate use by individuals with disabilities. These disabilities may be visual, auditory, physical, speech related, cognitive, neurological, or some combination. Accessibility features also help people with temporary disabilities, like a broken arm, or aging users as their abilities change. The Web receives information as well as providing information and interacting with society. The World Wide Web Consortium claims it essential that the Web be accessible, so it can provide equal access and equal opportunity to people with disabilities. Tim Berners-Lee once noted, “The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect.” Many countries regulate web accessibility as a requirement for websites. International cooperation in the W3C Web Accessibility Initiative led to simple guidelines that web content authors as well as software developers can use to make the Web accessible to persons who may or may not be using assistive technology.

Internationalization

The W3C Internationalization Activity assures that web technology works in all languages, scripts, and cultures. Beginning in 2004 or 2005, Unicode gained ground and eventually, in December 2007, surpassed both ASCII and Western European encodings as the Web’s most frequently used character encoding. Originally RFC 3986 allowed resources to be identified by URI in a subset of US-ASCII. RFC 3987 allows more characters (any character in the Universal Character Set), and now a resource can be identified by IRI in any language.

Statistics

Between 2005 and 2010, the number of web users doubled, and was expected to surpass two billion in 2010. Early studies in 1998 and 1999 estimating the size of the Web using capture/recapture methods showed that much of the web was not indexed by search engines and the Web was much larger than expected. According to a 2001 study, there was a massive number, over 550 billion, of documents on the Web, mostly in the invisible Web, or Deep Web. A 2002 survey of 2,024 million web pages determined that by far the most web content was in the English language: 56.4%; next were pages in German (7.7%), French (5.6%), and Japanese (4.9%). A more recent study, which used web searches in 75 different languages to sample the Web, determined that there were over 11.5 billion web pages in the publicly indexable web as of the end of January 2005. As of March 2009, the indexable web contains at least 25.21 billion pages. On 25 July 2008, Google software engineers Jesse Alpert and Nissan Hajaj announced that Google Search had discovered one trillion unique URLs. As of May 2009, over 109.5 million domains operated. Of these, 74% were commercial or other domains operating in the generic top-level domain com.

Statistics measuring a website’s popularity are usually based either on the number of page views or on associated server ‘hits’ (file requests) that it receives.

Speed issues

Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow browsing has led to a pejorative name for the World Wide Web: the World Wide Wait. Speeding up the Internet is an ongoing discussion over the use of peering and QoS technologies. Other solutions to reduce the congestion can be found at W3C. Guidelines for web response times are:

  • 0.1 second (one tenth of a second). Ideal response time. The user does not sense any interruption.
  • 1 second. Highest acceptable response time. Download times above 1 second interrupt the user experience.
  • 10 seconds. Unacceptable response time. The user experience is interrupted and the user is likely to leave the site or system.

Web caching

A web cache is a server computer, located either on the public Internet or within an enterprise, that stores recently accessed web pages to improve response time for users when the same content is requested within a certain time after the original request.

Most web browsers also implement a browser cache for recently obtained data, usually on the local disk drive. HTTP requests by a browser may ask only for data that has changed since the last access. Web pages and resources may contain expiration information to control caching to secure sensitive data, such as in online banking, or to facilitate frequently updated sites, such as news media. Even sites with highly dynamic content may permit basic resources to be refreshed only occasionally. Web site designers find it worthwhile to collate resources such as CSS data and JavaScript into a few site-wide files so that they can be cached efficiently.
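The sketch below (Python standard library only, against the reserved example.com address) shows the conditional-request idea in its simplest form: the client remembers the Last-Modified validator and later asks for the body only if it has changed, while a 304 Not Modified response means the cached copy can be reused.

  import urllib.request
  from urllib.error import HTTPError

  url = "http://example.com/"

  # First request: keep the body and note the validator the server returned.
  with urllib.request.urlopen(url) as resp:
      cached_body = resp.read()
      last_modified = resp.headers.get("Last-Modified")

  # Later request: ask for the body only if it has changed since then.
  headers = {"If-Modified-Since": last_modified} if last_modified else {}
  try:
      with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as resp:
          cached_body = resp.read()   # 200 OK: content changed, refresh the cache
  except HTTPError as err:
      if err.code == 304:
          pass                        # 304 Not Modified: keep using cached_body
      else:
          raise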

Enterprise firewalls often cache Web resources requested by one user for the benefit of many. Some search engines store cached content of frequently accessed websites.

 

Introduction

A photo of children's blocks that spell "URL."

A uniform resource locator (URL) is a reference to a resource that specifies the location of the resource on a computer network and a mechanism for retrieving it. A URL is a specific type of uniform resource identifier (URI), although many people use the two terms interchangeably. A URL implies the means to access an indicated resource, which is not true of every URI. URLs occur most commonly to reference web pages (http), but are also used for file transfer (ftp), email (mailto), database access (JDBC), and many other applications.

Most web browsers display the URL of a web page above the page in an address bar. A typical URL has the form http://www.example.com/index.html, which indicates the protocol type (http), the domain name (www.example.com), and the specific web page (index.html).

History

The Uniform Resource Locator was standardized in 1994 by Tim Berners-Lee and the URI working group of the Internet Engineering Task Force (IETF) as an outcome of collaboration started at the IETF Living Documents “Birds of a Feather” session in 1992. The format combines the pre-existing system of domain names (created in 1985) with file path syntax, where slashes are used to separate directory and file names. Conventions already existed where server names could be prepended to complete file paths, preceded by a double-slash (//).

Berners-Lee later regretted the use of dots to separate the parts of the domain name within URIs, wishing he had used slashes throughout. For example, http://www.example.com/path/to/name would have been written http:com/example/www/path/to/name. Berners-Lee has also said that, given the colon following the URI scheme, the two slashes before the domain name were also unnecessary.

Syntax

Every HTTP URL consists of the following, in the given order. Several schemes other than HTTP also share this general format, with some variation.

  • the scheme name (commonly called protocol, although not every URL scheme is a protocol, e.g. mailto is not a protocol)
  • a colon, two slashes,
  • a host, normally given as a domain name but sometimes as a literal IP address
  • optionally a colon followed by a port number
  • the full path of the resource

The scheme says how to connect, the host specifies where to connect, and the remainder specifies what to ask for.

For programs such as Common Gateway Interface (CGI) scripts, this is followed by a query string, and an optional fragment identifier.

The syntax is:

scheme://[user:password@]domain:port/path?query_string#fragment_id

Component details:

  • The scheme, which in many cases is the name of a protocol (but not always), defines how the resource will be obtained. Examples include http, https, ftp, file and many others. Although schemes are case-insensitive, the canonical form is lowercase.
  • The domain name or literal numeric IP address gives the destination location for the URL. A literal numeric IPv6 address may be given, but must be enclosed in square brackets, e.g. [db8:0cec::99:123a].
    The domain google.com, or its numeric IP address 173.194.34.5, is the address of Google’s website.
  • The domain name portion of a URL is not case sensitive since DNS ignores case:
    http://en.example.org/ and HTTP://EN.EXAMPLE.ORG/ both open the same page.
  • The port number, given in decimal, is optional; if omitted, the default for the scheme is used.
    For example, http://vnc.example.com:5800 connects to port 5800 of vnc.example.com, which may be appropriate for a VNC remote control session. If the port number is omitted for an http: URL, the browser will connect on port 80, the default HTTP port. The default port for an https: request is 443.
  • The path is used to specify and perhaps find the resource requested. This path may or may not describe folders on the file system in the web server. It may be very different from the arrangement of folders on the web server. It is case-sensitive, though it may be treated as case-insensitive by some servers, especially those based on Microsoft Windows.
    If the server is case sensitive and http://en.example.org/wiki/URL is correct, then http://en.example.org/WIKI/URL or http://en.example.org/wiki/url will display an HTTP 404 error page, unless these URLs point to valid resources themselves.
  • The query string contains data to be passed to software running on the server. It may contain name/value pairs separated by ampersands, for example
    ?first_name=John&last_name=Doe.
  • The fragment identifier, if present, specifies a part or a position within the overall resource or document.
    When used with HTML, it usually specifies a section or location within the page, and used in combination with Anchor elements or the “id” attribute of an element, the browser is scrolled to display that part of the page.

The scheme name defines the namespace, purpose, and the syntax of the remaining part of the URL. Software will try to process a URL according to its scheme and context. For example, a web browser will usually dereference the URL http://example.org:80 by performing an HTTP request to the host at example.org, using port number 80.

Other examples of scheme names include https, gopher, wais, ftp. URLs with https as a scheme (such as https://example.com/) require that requests and responses will be made over a secure connection to the website. Some schemes that require authentication allow a username, and perhaps a password too, to be embedded in the URL, for example ftp://asmith@ftp.example.org. Passwords embedded in this way are not conducive to security, but the full possible syntax is

scheme://username:password@domain:port/path?query_string#fragment_id
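For reference, Python’s standard urllib.parse module decomposes a URL of that full form into the same components; a short sketch (with a made-up URL) follows:

  from urllib.parse import urlparse

  url = "https://user:secret@www.example.com:8443/docs/index.html?first_name=John&last_name=Doe#top"
  parts = urlparse(url)

  print(parts.scheme)    # 'https'
  print(parts.username)  # 'user'
  print(parts.password)  # 'secret'
  print(parts.hostname)  # 'www.example.com'
  print(parts.port)      # 8443
  print(parts.path)      # '/docs/index.html'
  print(parts.query)     # 'first_name=John&last_name=Doe'
  print(parts.fragment)  # 'top'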

Other schemes do not follow the HTTP pattern. For example, the mailto scheme only uses valid email addresses. When clicked on in an application, the URL mailto:bob@example.com may start an e-mail composer with the address bob@example.com in the To field. The tel scheme differs even more; it uses the public switched telephone network for addressing, instead of domain names representing Internet hosts.

List of allowed URL characters

Unreserved

The alphanumeric upper- and lower-case characters, and a few other unreserved symbols, may appear unencoded; they may optionally be percent-encoded:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
a b c d e f g h i j k l m n o p q r s t u v w x y z
0 1 2 3 4 5 6 7 8 9 - _ . ~

Reserved

Special symbols must sometimes be percent-encoded:

! * ' ( ) ; : @ & = + $ , / ? % # [ ]

Further details can be found, for example, in RFC 3986 and http://www.w3.org/Addressing/URL/uri-spec.html.
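As a quick illustration (a Python sketch, not part of any specification), urllib.parse can apply and reverse percent-encoding:

  from urllib.parse import quote, unquote

  print(quote("price & tax = 100%"))                  # 'price%20%26%20tax%20%3D%20100%25'
  print(unquote("price%20%26%20tax%20%3D%20100%25"))  # 'price & tax = 100%'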

Relationship to URI

A URL is a URI that, in addition to identifying a web resource, provides a means of locating the resource by describing its “primary access mechanism (e.g., its network location)”.

Internet hostnames

A hostname is a domain name assigned to a host computer. This is usually a combination of the host’s local name with its parent domain’s name. For example, en.example.org consists of a local hostname (en) and the domain name example.org. The hostname is translated into an IP address via the local hosts file, or the domain name system (DNS) resolver. It is possible for a single host computer to have several hostnames; but generally the operating system of the host prefers to have one hostname that the host uses for itself.

Any domain name can also be a hostname, as long as the restrictions mentioned below are followed. For example, both “en.example.org” and “example.org” can be hostnames if they both have IP addresses assigned to them. The domain name “xyz.example.org” may not be a hostname if it does not have an IP address, but “aa.xyz.example.org” may still be a hostname. All hostnames are domain names, but not all domain names are hostnames.
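A short Python sketch of hostname-to-address resolution through the operating system’s resolver (hosts file or DNS); example.org is a reserved example domain, and the addresses printed depend on the resolver you use:

  import socket

  # One IPv4 address for the hostname.
  print(socket.gethostbyname("example.org"))

  # getaddrinfo() also returns IPv6 results where available.
  for family, _, _, _, sockaddr in socket.getaddrinfo("example.org", 80):
      print(family.name, sockaddr[0])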

URL protocols

The protocol, or scheme, of a URL defines how the resource will be obtained. Two common protocols on the web are HTTP and HTTPS. For various reasons, many sites have been switching to permitting access through both the HTTP and HTTPS protocols. Each protocol has advantages and disadvantages, and for some users one or the other protocol may not function or may be undesirable. When a link contains a protocol specifier, the browser follows the link using the specified protocol regardless of the user’s preference.

Protocol-relative URLs

It is possible to construct valid URLs without specifying a protocol which are called protocol-relative links (PRL) or protocol-relative URLs. Using PRLs on a page permits the viewer of the page to visit new pages using whichever protocol was used to obtain the page containing the link. This supports continuing to use whichever protocol the viewer has chosen to use for obtaining the current page when accessing new pages.

An example of a PRL is //en.wikipedia.org/wiki/Main_Page which is created by removing the protocol prefix.

Internationalized URL

Internet users are distributed throughout the world using a wide variety of languages and alphabets. Users expect to be able to create URLs in their own local alphabets.

An internationalized resource identifier (IRI) is a form of URL that includes Unicode characters. All modern browsers support IRIs. The parts of the URL requiring special treatment for different alphabets are the domain name and path.

The domain name in the IRI is known as an internationalized domain name (IDN). Web and Internet software automatically convert the domain name into punycode usable by the Domain Name System.

For example, the Chinese web site http://見.香港 becomes the following for DNS lookup. xn-- indicates the character was not originally ASCII.

http://xn--nw2a.xn--j6w193g/

The URL path name can also be specified by the user in the local alphabet. If not already encoded, it is converted to Unicode (UTF-8), and any characters not part of the basic URL character set are escaped using percent-encoding.

For example, the following Japanese Web page http://domainname/引き割り.html becomes http://domainname/%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html. The target computer decodes the address and displays the page.
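The following Python sketch mirrors both conversions described above; note that the built-in “idna” codec implements the older IDNA 2003 rules, so production software often relies on a dedicated IDNA library instead:

  from urllib.parse import quote

  host = "見.香港".encode("idna").decode("ascii")   # 'xn--nw2a.xn--j6w193g'
  path = "/" + quote("引き割り.html")               # '/%E5%BC%95%E3%81%8D%E5%89%B2%E3%82%8A.html'
  print("http://" + host + path)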

 

Introduction

Composite graphic showing logos for four major Web browsers: Firefox, Safari, Chrome, and Internet Explorer

A web browser (commonly referred to as a browser) is a software application for retrieving, presenting and traversing information resources on the World Wide Web. An information resource is identified by a Uniform Resource Identifier (URI/URL) and may be a web page, image, video or other piece of content. Hyperlinks present in resources enable users easily to navigate their browsers to related resources.

Although browsers are primarily intended to use the World Wide Web, they can also be used to access information provided by web servers in private networks or files in file systems.

The major web browsers are Firefox, Internet Explorer, Google Chrome, Opera, and Safari.

The first web browser was invented in 1990 by Sir Tim Berners-Lee. Berners-Lee is the director of the World Wide Web Consortium (W3C), which oversees the Web’s continued development, and is also the founder of the World Wide Web Foundation. His browser was called WorldWideWeb and later renamed Nexus.

Photograph of a seated Marc Andreessen, smiling away from the camera and wearing an adidas zip-up jacket.

 

Marc Andreessen, co-founder of Netscape

The first commonly available web browser with a graphical user interface was Erwise. The development of Erwise was initiated by Robert Cailliau.

In 1993, browser software was further innovated by Marc Andreessen with the release of Mosaic, “the world’s first popular browser”, which made the World Wide Web easy to use and more accessible to the average person. Andreessen’s browser, one of the first graphical web browsers, sparked the Internet boom of the 1990s and led to an explosion in web use. Andreessen, the leader of the Mosaic team at the National Center for Supercomputing Applications (NCSA), soon started his own company, named Netscape, and released the Mosaic-influenced Netscape Navigator in 1994, which quickly became the world’s most popular browser, accounting for 90% of all web use at its peak (see usage share of web browsers).

Microsoft responded with its Internet Explorer in 1995, also heavily influenced by Mosaic, initiating the industry’s first browser war. Bundled with Windows, Internet Explorer gained dominance in the web browser market; Internet Explorer usage share peaked at over 95% by 2002.

Opera debuted in 1996; it has never achieved widespread use, having less than 2% browser usage share as of February 2012 according to Net Applications. Its Opera-mini version has an additive share, in April 2011 amounting to 1.1% of overall browser use, but focused on the fast-growing mobile phone web browser market, being preinstalled on over 40 million phones. It is also available on several other embedded systems, including Nintendo’s Wii video game console.

In 1998, Netscape launched what was to become the Mozilla Foundation in an attempt to produce a competitive browser using the open source software model. That browser would eventually evolve into Firefox, which developed a respectable following while still in the beta stage of development; shortly after the release of Firefox 1.0 in late 2004, Firefox (all versions) accounted for 7% of browser use. As of August 2011, Firefox has a 28% usage share.

Apple’s Safari had its first beta release in January 2003; as of April 2011, it had a dominant share of Apple-based web browsing, accounting for just over 7% of the entire browser market.

The most recent major entrant to the browser market is Chrome, first released in September 2008. Chrome’s take-up has increased significantly year by year, doubling its usage share from 8% to 16% by August 2011. This increase seems largely to be at the expense of Internet Explorer, whose share has tended to decrease from month to month. In December 2011, Chrome overtook Internet Explorer 8 as the most widely used web browser but still had lower usage than all versions of Internet Explorer combined. Chrome’s user-base continued to grow and in May 2012, Chrome’s usage passed the usage of all versions of Internet Explorer combined. By April 2014, Chrome’s usage had hit 45%.

Internet Explorer will be deprecated in Windows 10, with Microsoft Edge replacing it as the default web browser.

Business models

The ways that web browser makers fund their development costs has changed over time. The first web browser, WorldWideWeb, was a research project.

Netscape Navigator was sold commercially, as was Opera.

Internet Explorer, on the other hand, was bundled free with the Windows operating system (and was also downloadable free), and therefore it was funded partly by the sales of Windows to computer manufacturers and direct to users. Internet Explorer also used to be available for the Mac. It is likely that releasing IE for the Mac was part of Microsoft’s overall strategy to fight threats to its quasi-monopoly platform dominance – threats such as web standards and Java – by making some web developers, or at least their managers, assume that there was “no need” to develop for anything other than Internet Explorer. In this respect, IE may have contributed to Windows and Microsoft applications sales in another way, through “lock-in” to Microsoft’s browser.

In January 2009, the European Commission announced it would investigate the bundling of Internet Explorer with Windows operating systems from Microsoft, saying “Microsoft’s tying of Internet Explorer to the Windows operating system harms competition between web browsers, undermines product innovation and ultimately reduces consumer choice.”

Safari and Mobile Safari were likewise always included with OS X and iOS respectively, so, similarly, they were originally funded by sales of Apple computers and mobile devices, and formed part of the overall Apple experience to customers.

Today, most commercial web browsers are paid by search engine companies to make their engine the default, or to include it as another option. For example, Google pays Mozilla, the maker of Firefox, to make Google Search the default search engine in Firefox. Mozilla makes enough money from this deal that it does not need to charge users for Firefox. In addition, Google Search is also (as one would expect) the default search engine in Google Chrome. Users searching for websites or items on the Internet are led to Google’s search results page, increasing the ad revenue that funds development at Google, including development of Google Chrome.

Many less-well-known free software browsers, such as Konqueror, were hardly funded at all and were developed mostly by volunteers free of charge.

Function

The primary purpose of a web browser is to bring information resources to the user (“retrieval” or “fetching”), allowing them to view the information (“display”, “rendering”), and then access other information (“navigation”, “following links”).

This process begins when the user inputs a Uniform Resource Locator (URL), for example http://en.wikipedia.org/, into the browser. The prefix of the URL, its scheme, determines how the URL will be interpreted. The most commonly used kind of URI starts with http: and identifies a resource to be retrieved over the Hypertext Transfer Protocol (HTTP). Many browsers also support a variety of other prefixes, such as https: for HTTPS, ftp: for the File Transfer Protocol, and file: for local files. Prefixes that the web browser cannot directly handle are often handed off to another application entirely. For example, mailto: URIs are usually passed to the user’s default e-mail application, and news: URIs are passed to the user’s default newsgroup reader.

In the case of http, https, file, and others, once the resource has been retrieved the web browser will display it. HTML and associated content (image files, formatting information such as CSS, etc.) is passed to the browser’s layout engine to be transformed from markup to an interactive document, a process known as “rendering”. Aside from HTML, web browsers can generally display any kind of content that can be part of a web page. Most browsers can display images, audio, video, and XML files, and often have plug-ins to support Flash applications and Java applets. Upon encountering a file of an unsupported type or a file that is set up to be downloaded rather than displayed, the browser prompts the user to save the file to disk.
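A toy Python sketch of that fetch-then-decide step, against the reserved example.com address; real browsers weigh several signals, but the Content-Type header is the main one:

  import urllib.request

  with urllib.request.urlopen("http://example.com/") as resp:
      content_type = resp.headers.get_content_type()   # e.g. 'text/html'
      data = resp.read()

  RENDERABLE = {"text/html", "text/plain", "image/png", "image/jpeg"}
  if content_type in RENDERABLE:
      print("render", content_type, len(data), "bytes")
  else:
      print("prompt the user to save the file to disk")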

Information resources may contain hyperlinks to other information resources. Each link contains the URI of a resource to go to. When a link is clicked, the browser navigates to the resource indicated by the link’s target URI, and the process of bringing content to the user begins again.

Features

Available web browsers range in features from minimal, text-based user interfaces with bare-bones support for HTML to rich user interfaces supporting a wide variety of file formats and protocols. Browsers which include additional components to support e-mail, Usenet news, and Internet Relay Chat (IRC), are sometimes referred to as “Internet suites” rather than merely “web browsers”.

All major web browsers allow the user to open multiple information resources at the same time, either in different browser windows or in different tabs of the same window. Major browsers also include pop-up blockers to prevent unwanted windows from “popping up” without the user’s consent.

Screen shot of a list of browser bookmarks

 

Browser bookmarks

Most web browsers can display a list of web pages that the user has bookmarked so that the user can quickly return to them. Bookmarks are also called “Favorites” in Internet Explorer. In addition, all major web browsers have some form of built-in web feed aggregator. In Firefox, web feeds are formatted as “live bookmarks” and behave like a folder of bookmarks corresponding to recent entries in the feed. In Opera, a more traditional feed reader is included which stores and displays the contents of the feed.

Furthermore, most browsers can be extended via plug-ins, downloadable components that provide additional features.

User interface

Photo of LG Smart TV on-screen keyboard.

 

Some home media devices now include web browsers, like this LG Smart TV. The browser is controlled using an on-screen keyboard and LG’s “Magic Motion” remote.

Most major web browsers have these user interface elements in common:

  • Back and forward buttons to return to the previous resource and go forward again, respectively.
  • A refresh or reload button to reload the current resource.
  • A stop button to cancel loading the resource. In some browsers, the stop button is merged with the reload button.
  • A home button to return to the user’s home page.
  • An address bar to input the Uniform Resource Identifier (URI) of the desired resource and display it.
  • A search bar to input terms into a search engine. In some browsers, the search bar is merged with the address bar.
  • A status bar to display progress in loading the resource and also the URI of links when the cursor hovers over them, and page zooming capability.
  • The viewport, the visible area of the webpage within the browser window.
  • The ability to view the HTML source for a page.

Major browsers also possess incremental find features to search within a web page.

Privacy and security

Most browsers support HTTP Secure and offer quick and easy ways to delete the web cache, cookies, and browsing history. For a comparison of the current security vulnerabilities of browsers, see comparison of web browsers.

Standards support

Early web browsers supported only a very simple version of HTML. The rapid development of proprietary web browsers led to the development of non-standard dialects of HTML, leading to problems with interoperability. Modern web browsers support a combination of standards-based and de facto HTML and XHTML, which should be rendered in the same way by all browsers.

Extensibility

A browser extension is a computer program that extends the functionality of a web browser. Every major web browser supports the development of browser extensions.

Components

Web browsers consist of a user interface, layout engine, rendering engine, JavaScript interpreter, UI backend, networking component and data persistence component. These components achieve different functionalities of a web browser and together provide all capabilities of a web browser.

A name server is a computer hardware or software server that implements a network service for providing responses to queries against a directory service. It translates an often humanly meaningful, text-based identifier to a system-internal, often numeric identification or addressing component. This service is performed by the server in response to a service protocol request.

An example of a name server is the server component of the Domain Name System (DNS), one of the two principal name spaces of the Internet. The most important function of DNS servers is the translation (resolution) of human-memorable domain names and hostnames into the corresponding numeric Internet Protocol (IP) addresses, the second principal name space of the Internet which is used to identify and locate computer systems and resources on the Internet.

Domain Name System

The Internet maintains two principal namespaces: the domain name hierarchy and the IP address system. The Domain Name System maintains the domain namespace and provides translation services between these two namespaces. Internet name servers implement the Domain Name System. The top hierarchy of the Domain Name System is served by the root name servers maintained by delegation by the Internet Corporation for Assigned Names and Numbers (ICANN). Below the root, Internet resources are organized into a hierarchy of domains, administered by the respective registrars and domain name holders. A DNS name server is a server that stores the DNS records, such as address (A, AAAA) records, name server (NS) records, and mail exchanger (MX) records for a domain name (see also List of DNS record types) and responds with answers to queries against its database.
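A brief sketch of querying a few of those record types, using the third-party dnspython package (not part of the Python standard library) and the reserved example.org domain:

  import dns.resolver   # pip install dnspython

  for rtype in ("A", "AAAA", "NS", "MX"):
      try:
          answers = dns.resolver.resolve("example.org", rtype)
      except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
          continue
      for record in answers:
          print(rtype, record.to_text())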

Authoritative name server

An authoritative name server is a name server that gives answers in response to questions asked about names in a zone. An authoritative-only name server returns answers only to queries about domain names that have been specifically configured by the administrator. Name servers can also be configured to give authoritative answers to queries in some zones, while acting as a caching name server for all other zones.

An authoritative name server can either be a primary server (master) or a secondary server (slave). A primary server for a zone is the server that stores the definitive versions of all records in that zone. A secondary server for a zone uses an automatic updating mechanism to maintain an identical copy of the primary server’s database for a zone. Examples of such mechanisms include DNS zone transfers and file transfer protocols. DNS provides a mechanism whereby the primary for a zone can notify all the known secondaries for that zone when the contents of the zone have changed. The contents of a zone are either manually configured by an administrator, or managed using Dynamic DNS.

Every domain name appears in a zone served by one or more authoritative name servers. The fully qualified domain names of the authoritative name servers of a zone are listed in the NS records of that zone. If the server for a zone is not also authoritative for its parent zone, the server for the parent zone must be configured with a delegation for the zone.

When a domain is registered with a domain name registrar, the zone administrator provides the list of name servers (typically at least two, for redundancy) that are authoritative for the zone that contains the domain. The registrar provides the names of these servers to the domain registry for the top level domain containing the zone. The domain registry in turn configures the authoritative name servers for that top level domain with delegations for each server for the zone. If the fully qualified domain name of any name server for a zone appears within that zone, the zone administrator provides IP addresses for that name server, which are installed in the parent zone as glue records; otherwise, the delegation consists of the list of NS records for that zone.

Authoritative answer

A name server indicates that its response is authoritative by setting the Authoritative Answer (AA) bit in the response to a query on a name for which it is authoritative. Name servers providing answers for which they are not authoritative (for example, name servers for parent zones), do not set the AA bit.

Recursive query

Photo of the center of a purple and yellow daisy. The petals spiral into the center.

If a name server cannot answer a query because it does not contain an entry for the host in its database, it may recursively query name servers higher up in the hierarchy. This is known as a recursive query or recursive lookup. In principle, authoritative name servers suffice for the operation of the Internet. However, with only authoritative name-servers operating, every DNS query must start with recursive queries at the root zone of the Domain Name System and each user system must implement resolver software capable of recursive operation.

Caching name server

Caching name servers (DNS caches) store DNS query results for a period of time determined in the configuration (time-to-live) of each domain-name record. DNS caches improve the efficiency of the DNS by reducing DNS traffic across the Internet, and by reducing load on authoritative name-servers, particularly root name-servers. Because they can answer questions more quickly, they also increase the performance of end-user applications that use the DNS.

Recursive name servers resolve any query they receive, even if they are not authoritative for the question being asked, by consulting the server or servers that are authoritative for the question. Caching name servers are often also recursive name servers: they perform every step necessary to answer any DNS query they receive. To do this the name server queries each authoritative name-server in turn, starting from the DNS root zone. It continues until it reaches the authoritative server for the zone that contains the queried domain name. That server provides the answer to the question, or definitively says it can’t be answered, and the caching resolver then returns this response to the client that asked the question.

The authority, resolving and caching functions can all be present in a DNS server implementation, but this is not required: a DNS server can implement any one of these functions alone, without implementing the others. Internet service providers typically provide caching resolvers for their customers. In addition, many home-networking routers implement caching resolvers to improve efficiency in the local network. Some systems utilize nscd, the name service caching daemon.

Microsoft networking

Name servers also exist on some Microsoft Windows networks, where one host assumes the role of NetBIOS browse master and performs as an NBNS server. Small local area networks of Windows systems require no central name server, and generally perform name-resolution using a broadcast algorithm.

The Windows Internet Name Service (WINS) is a name service that translates NetBIOS names to numerical addresses.

 

Introduction

Photo of a tiny house being pulled on a trailer on a road.

An Internet Protocol address (IP address) is a numerical label assigned to each device (e.g., computer, printer) participating in a computer network that uses the Internet Protocol for communication. An IP address serves two principal functions: host or network interface identification and location addressing. Its role has been characterized as follows: “A name indicates what we seek. An address indicates where it is. A route indicates how to get there.”

The designers of the Internet Protocol defined an IP address as a 32-bit number and this system, known as Internet Protocol Version 4 (IPv4), is still in use today. However, because of the growth of the Internet and the predicted depletion of available addresses, a new version of IP (IPv6), using 128 bits for the address, was developed in 1995. IPv6 was standardized as RFC 2460 in 1998, and its deployment has been ongoing since the mid-2000s.

IP addresses are usually written and displayed in human-readable notations, such as 172.16.254.1 (IPv4), and 2001:db8:0:1234:0:567:8:1 (IPv6).

The Internet Assigned Numbers Authority (IANA) manages the IP address space allocations globally and delegates five regional Internet registries (RIRs) to allocate IP address blocks to local Internet registries (Internet service providers) and other entities.

IP versions

Two versions of the Internet Protocol (IP) are in use: IP Version 4 and IP Version 6. Each version defines an IP address differently. Because of its prevalence, the generic term IP address typically still refers to the addresses defined by IPv4. The gap in version sequence between IPv4 and IPv6 resulted from the assignment of number 5 to the experimental Internet Stream Protocol in 1979, which however was never referred to as IPv5.

IPv4 addresses

Example of an IP address converted into binary code. For example 172 changes to 10101100. One byte equals eight bits.

 

Decomposition of an IPv4 address from dot-decimal notation to its binary value.

In IPv4 an address consists of 32 bits, which limits the address space to 4,294,967,296 (2^32) possible unique addresses. IPv4 reserves some addresses for special purposes such as private networks (~18 million addresses) or multicast addresses (~270 million addresses).

IPv4 addresses are canonically represented in dot-decimal notation, which consists of four decimal numbers, each ranging from 0 to 255, separated by dots, e.g., 172.16.254.1. Each part represents a group of 8 bits (octet) of the address. In some cases of technical writing, IPv4 addresses may be presented in various hexadecimal, octal, or binary representations.
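The decomposition shown above can be reproduced with Python’s ipaddress module; a small sketch using the 172.16.254.1 address from the text:

  import ipaddress

  addr = ipaddress.IPv4Address("172.16.254.1")
  print(int(addr))                  # 2886794753
  print(format(int(addr), "032b"))  # '10101100000100001111111000000001'
  print(addr.packed)                # b'\xac\x10\xfe\x01' (the four octets)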

Subnetting

In the early stages of development of the Internet Protocol, network administrators interpreted an IP address in two parts: network number portion and host number portion. The highest order octet (most significant eight bits) in an address was designated as the network number and the remaining bits were called the rest field or host identifier and were used for host numbering within a network.

This early method soon proved inadequate as additional networks developed that were independent of the existing networks already designated by a network number. In 1981, the Internet addressing specification was revised with the introduction of classful network architecture.

Classful network design allowed for a larger number of individual network assignments and fine-grained subnetwork design. The first three bits of the most significant octet of an IP address were defined as the class of the address. Three classes (A, B, and C) were defined for universal unicast addressing. Depending on the class derived, the network identification was based on octet boundary segments of the entire address. Each class used successively more octets in the network identifier, thus reducing the possible number of hosts in the higher order classes (B and C). The following table gives an overview of this now obsolete system.

Historical classful network architecture

Class | Leading bits | Network number bits | Rest (host) bits | Number of networks | Addresses per network | Start address | End address
A     | 0            | 8                   | 24               | 128 (2^7)          | 16,777,216 (2^24)     | 0.0.0.0       | 127.255.255.255
B     | 10           | 16                  | 16               | 16,384 (2^14)      | 65,536 (2^16)         | 128.0.0.0     | 191.255.255.255
C     | 110          | 24                  | 8                | 2,097,152 (2^21)   | 256 (2^8)             | 192.0.0.0     | 223.255.255.255

Classful network design served its purpose in the startup stage of the Internet, but it lacked scalability in the face of the rapid expansion of the network in the 1990s. The class system of the address space was replaced with Classless Inter-Domain Routing (CIDR) in 1993. CIDR is based on variable-length subnet masking (VLSM) to allow allocation and routing based on arbitrary-length prefixes.

Today, remnants of classful network concepts function only in a limited scope as the default configuration parameters of some network software and hardware components (e.g. netmask), and in the technical jargon used in network administrators’ discussions.

Private addresses

Early network design, when global end-to-end connectivity was envisioned for communications with all Internet hosts, intended that IP addresses be uniquely assigned to a particular computer or device. However, it was found that this was not always necessary as private networks developed and public address space needed to be conserved.

Computers not connected to the Internet, such as factory machines that communicate only with each other via TCP/IP, need not have globally unique IP addresses. Three ranges of IPv4 addresses for private networks were reserved in RFC 1918. These addresses are not routed on the Internet and thus their use need not be coordinated with an IP address registry.

Today, when needed, such private networks typically connect to the Internet through network address translation (NAT).

IANA-reserved private IPv4 network ranges

Block                              | Start       | End             | No. of addresses
24-bit block (/8 prefix, 1 × A)    | 10.0.0.0    | 10.255.255.255  | 16,777,216
20-bit block (/12 prefix, 16 × B)  | 172.16.0.0  | 172.31.255.255  | 1,048,576
16-bit block (/16 prefix, 256 × C) | 192.168.0.0 | 192.168.255.255 | 65,536

Any user may use any of the reserved blocks. Typically, a network administrator will divide a block into subnets; for example, many home routers automatically use a default address range of 192.168.0.0 through 192.168.0.255 (192.168.0.0/24).

IPv4 address exhaustion

High levels of demand have decreased the supply of unallocated Internet Protocol Version 4 (IPv4) addresses available for assignment to Internet service providers and end user organizations since the 1980s. This development is referred to as IPv4 address exhaustion. IANA’s primary address pool was exhausted on 3 February 2011, when the last five blocks were allocated to the five RIRs.[5][6] APNIC was the first RIR to exhaust its regional pool on 15 April 2011, except for a small amount of address space reserved for the transition to IPv6, intended to be allocated in a restricted process.[7]

IPv6 addresses

 

Decomposition of an IPv6 address from hexadecimal representation to its binary value.

The rapid exhaustion of IPv4 address space prompted the Internet Engineering Task Force (IETF) to explore new technologies to expand the addressing capability in the Internet. The permanent solution was deemed to be a redesign of the Internet Protocol itself. This new generation of the Internet Protocol was eventually named Internet Protocol Version 6 (IPv6) in 1995. The address size was increased from 32 to 128 bits (16 octets), thus providing up to 2^128 (approximately 3.403×10^38) addresses. This is deemed sufficient for the foreseeable future.

The intent of the new design was not to provide just a sufficient quantity of addresses, but also to redesign routing in the Internet by more efficient aggregation of subnetwork routing prefixes. This resulted in slower growth of routing tables in routers. The smallest possible individual allocation is a subnet for 2^64 hosts, which is the square of the size of the entire IPv4 Internet. At these levels, actual address utilization rates will be small on any IPv6 network segment. The new design also provides the opportunity to separate the addressing infrastructure of a network segment, i.e. the local administration of the segment’s available space, from the addressing prefix used to route traffic to and from external networks. IPv6 has facilities that automatically change the routing prefix of entire networks, should the global connectivity or the routing policy change, without requiring internal redesign or manual renumbering.

The large number of IPv6 addresses allows large blocks to be assigned for specific purposes and, where appropriate, to be aggregated for efficient routing. With a large address space, there is no need to have complex address conservation methods as used in CIDR.

All modern desktop and enterprise server operating systems include native support for the IPv6 protocol, but it is not yet widely deployed in other devices, such as residential networking routers, voice over IP (VoIP) and multimedia equipment, and network peripherals.

Private addresses

Just as IPv4 reserves addresses for private networks, blocks of addresses are set aside in IPv6. In IPv6, these are referred to as unique local addresses (ULA). RFC 4193 reserves the routing prefix fc00::/7 for this block which is divided into two /8 blocks with different implied policies. The addresses include a 40-bit pseudorandom number that minimizes the risk of address collisions if sites merge or packets are misrouted.[8]
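A rough Python sketch of choosing such a prefix (the RFC derives the 40-bit global ID from a hash of a timestamp and an interface identifier; here it is simply drawn at random, which is a simplification):

  import secrets
  import ipaddress

  global_id = secrets.token_bytes(5)                     # 40 pseudorandom bits
  prefix_bytes = bytes([0xfd]) + global_id + bytes(10)   # fd00::/8 half of fc00::/7, rest zeroed
  prefix = ipaddress.IPv6Address(prefix_bytes)
  print(ipaddress.ip_network(f"{prefix}/48"))            # e.g. a fdxx:xxxx:xxxx::/48 ULA prefix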

Early practices used a different block for this purpose (fec0::), dubbed site-local addresses. However, the definition of what constituted sites remained unclear and the poorly defined addressing policy created ambiguities for routing. This address type was abandoned and must not be used in new systems.

Addresses starting with fe80:, called link-local addresses, are assigned to interfaces for communication on the attached link. The addresses are automatically generated by the operating system for each network interface. This provides instant and automatic communication between all IPv6 hosts on a link. This feature is required in the lower layers of IPv6 network administration, such as for the Neighbor Discovery Protocol.

Private address prefixes may not be routed on the public Internet.

IP subnetworks

IP networks may be divided into subnetworks in both IPv4 and IPv6. For this purpose, an IP address is logically recognized as consisting of two parts: the network prefix and the host identifier, or interface identifier (IPv6). The subnet mask or the CIDR prefix determines how the IP address is divided into network and host parts.

The term subnet mask is only used within IPv4. Both IP versions however use the CIDR concept and notation. In this, the IP address is followed by a slash and the number (in decimal) of bits used for the network part, also called the routing prefix. For example, an IPv4 address and its subnet mask may be 192.0.2.1 and 255.255.255.0, respectively. The CIDR notation for the same IP address and subnet is 192.0.2.1/24, because the first 24 bits of the IP address indicate the network and subnet.
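Python’s ipaddress module makes that split explicit; a short sketch using the 192.0.2.1/24 example above:

  import ipaddress

  iface = ipaddress.ip_interface("192.0.2.1/24")
  print(iface.ip)                      # 192.0.2.1 (the host address)
  print(iface.netmask)                 # 255.255.255.0
  print(iface.network)                 # 192.0.2.0/24 (the network prefix)
  print(iface.network.num_addresses)   # 256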

IP address assignment

Internet Protocol addresses are assigned to a host either anew at the time of booting, or permanently by fixed configuration of its hardware or software. Persistent configuration is also known as using a static IP address. In contrast, in situations when the computer’s IP address is assigned newly each time, this is known as using a dynamic IP address.

Methods

Static IP addresses are manually assigned to a computer by an administrator. The exact procedure varies according to platform. This contrasts with dynamic IP addresses, which are assigned either by the computer interface or host software itself, as in Zeroconf, or assigned by a server using Dynamic Host Configuration Protocol (DHCP). Even though IP addresses assigned using DHCP may stay the same for long periods of time, they can generally change. In some cases, a network administrator may implement dynamically assigned static IP addresses. In this case, a DHCP server is used, but it is specifically configured to always assign the same IP address to a particular computer. This allows static IP addresses to be configured centrally, without having to specifically configure each computer on the network in a manual procedure.

In the absence or failure of static or stateful (DHCP) address configurations, an operating system may assign an IP address to a network interface using state-less auto-configuration methods, such as Zeroconf.

Uses of dynamic address assignment

IP addresses are most frequently assigned dynamically on LANs and broadband networks by the Dynamic Host Configuration Protocol (DHCP). Dynamic assignment is used because it avoids the administrative burden of assigning specific static addresses to each device on a network. It also allows many devices to share limited address space on a network if only some of them will be online at a particular time. In most current desktop operating systems, dynamic IP configuration is enabled by default so that a user does not need to manually enter any settings to connect to a network with a DHCP server. DHCP is not the only technology used to assign IP addresses dynamically. Dialup and some broadband networks use dynamic address features of the Point-to-Point Protocol.

Sticky dynamic IP address

A sticky dynamic IP address is an informal term used by cable and DSL Internet access subscribers to describe a dynamically assigned IP address which seldom changes. The addresses are usually assigned with DHCP. Since the modems are usually powered on for extended periods of time, the address leases are usually set to long periods and simply renewed. If a modem is turned off and powered up again before the next expiration of the address lease, it will most likely receive the same IP address.

Address autoconfiguration

RFC 3330 defines an address block, 169.254.0.0/16, for the special use of link-local addressing in IPv4 networks. In IPv6, every interface, whether using static or dynamic address assignment, also automatically receives a link-local address in the block fe80::/10.

These addresses are only valid on the link, such as a local network segment or point-to-point connection, that a host is connected to. These addresses are not routable and like private addresses cannot be the source or destination of packets traversing the Internet.

When the link-local IPv4 address block was reserved, no standards existed for mechanisms of address autoconfiguration. Filling the void, Microsoft created an implementation that is called Automatic Private IP Addressing (APIPA). APIPA has been deployed on millions of machines and has, thus, become a de facto standard in the industry. In RFC 3927, the IETF defined a formal standard for this functionality, entitled Dynamic Configuration of IPv4 Link-Local Addresses.

Uses of static addressing

Some infrastructure situations have to use static addressing, such as when finding the Domain Name System (DNS) host that will translate domain names to IP addresses. Static addresses are also convenient, but not absolutely necessary, to locate servers inside an enterprise. An address obtained from a DNS server comes with a time to live, or caching time, after which it should be looked up to confirm that it has not changed. Even static IP addresses do change as a result of network administration (RFC 2072).

Routing

Photo at night of highway and city lights

IP addresses are classified into several classes of operational characteristics: unicast, multicast, anycast and broadcast addressing.

Unicast addressing

The most common concept of an IP address is in unicast addressing, available in both IPv4 and IPv6. It normally refers to a single sender or a single receiver, and can be used for both sending and receiving. Usually, a unicast address is associated with a single device or host, but a device or host may have more than one unicast address. Some individual PCs have several distinct unicast addresses, each for its own distinct purpose. Sending the same data to multiple unicast addresses requires the sender to send all the data many times over, once for each recipient.

Broadcast addressing

In IPv4 it is possible to send data to all possible destinations (“all-hosts broadcast”), which permits the sender to send the data only once while all receivers receive a copy of it. In the IPv4 protocol, the address 255.255.255.255 is used for local (limited) broadcast. In addition, a directed broadcast can be made by combining the network prefix with a host suffix composed entirely of binary 1s. For example, the destination address used for a directed broadcast to devices on the 192.0.2.0/24 network is 192.0.2.255. IPv6 does not implement broadcast addressing and replaces it with multicast to the specially defined all-nodes multicast address.
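As a quick check in Python (using the 192.0.2.0/24 example above), the ipaddress module reports the same directed-broadcast address:

  import ipaddress

  print(ipaddress.ip_network("192.0.2.0/24").broadcast_address)  # 192.0.2.255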

Multicast addressing

A multicast address is associated with a group of interested receivers. In IPv4, addresses 224.0.0.0 through 239.255.255.255 (the former Class D addresses) are designated as multicast addresses. IPv6 uses the address block with the prefix ff00::/8 for multicast applications. In either case, the sender sends a single datagram from its unicast address to the multicast group address and the intermediary routers take care of making copies and sending them to all receivers that have joined the corresponding multicast group.
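To make the broadcast and multicast ranges above concrete, here is a small Python sketch (using the standard ipaddress module) that computes the directed broadcast address of the 192.0.2.0/24 example network and checks whether sample addresses are multicast.

    import ipaddress

    # Directed broadcast: network prefix + host bits set to all 1s
    net = ipaddress.ip_network("192.0.2.0/24")
    print("Directed broadcast for", net, "is", net.broadcast_address)   # 192.0.2.255

    # Multicast ranges: 224.0.0.0-239.255.255.255 (IPv4) and ff00::/8 (IPv6)
    for text in ["239.1.2.3", "192.0.2.255", "ff02::1"]:
        addr = ipaddress.ip_address(text)
        print(text, "is multicast" if addr.is_multicast else "is not multicast")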

Anycast addressing

Like broadcast and multicast, anycast is a one-to-many routing topology. However, the data stream is not transmitted to all receivers, only to the one that the router decides is logically closest in the network. Anycast addressing is a built-in feature of IPv6 only. In IPv4, anycast implementations typically operate using the shortest-path metric of BGP routing and do not take into account congestion or other attributes of the path. Anycast methods are useful for global load balancing and are commonly used in distributed DNS systems.

Public addresses

A public IP address, in common parlance, is synonymous with a globally routable unicast IP address.

Both IPv4 and IPv6 define address ranges that are reserved for private networks and link-local addressing. As commonly used, the term public IP address excludes these types of addresses.

Modifications to IP addressing

IP blocking and firewalls

Firewalls perform Internet Protocol blocking to protect networks from unauthorized access. They are common on today’s Internet. They control access to networks based on the IP address of a client computer. Whether using a blacklist or a whitelist, the IP address that is blocked is the perceived IP address of the client, meaning that if the client is using a proxy server or network address translation, blocking one IP address may block many individual computers.

IP address translation

Multiple client devices can appear to share IP addresses: either because they are part of a shared hosting web server environment or because an IPv4 network address translator (NAT) or proxy server acts as an intermediary agent on behalf of its customers, in which case the real originating IP addresses might be hidden from the server receiving a request. A common practice is to have a NAT hide a large number of IP addresses in a private network. Only the “outside” interface(s) of the NAT need to have Internet-routable addresses.

Most commonly, the NAT device maps TCP or UDP port numbers on the side of the larger, public network to individual private addresses on the masqueraded network.

In small home networks, NAT functions are usually implemented in a residential gateway device, typically one marketed as a “router”. In this scenario, the computers connected to the router would have private IP addresses and the router would have a public address to communicate on the Internet. This type of router allows several computers to share one public IP address.
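The sketch below is a deliberately simplified Python model of the port-mapping idea described above: outgoing traffic from private (address, port) pairs is rewritten to use one public address and a per-connection public port, and replies are mapped back. It only illustrates the bookkeeping, not a working NAT; the addresses and ports are invented for the example.

    # Toy NAT table: maps (private_ip, private_port) -> public_port
    PUBLIC_IP = "203.0.113.5"          # example public address (documentation range)
    nat_table = {}
    reverse_table = {}
    next_public_port = 40000

    def translate_outgoing(private_ip, private_port):
        """Rewrite the source of an outgoing packet to the shared public address."""
        global next_public_port
        key = (private_ip, private_port)
        if key not in nat_table:
            nat_table[key] = next_public_port
            reverse_table[next_public_port] = key
            next_public_port += 1
        return PUBLIC_IP, nat_table[key]

    def translate_incoming(public_port):
        """Map a reply arriving on a public port back to the private host."""
        return reverse_table.get(public_port)

    # Two private hosts sharing one public address
    print(translate_outgoing("192.168.1.10", 51000))   # ('203.0.113.5', 40000)
    print(translate_outgoing("192.168.1.11", 51000))   # ('203.0.113.5', 40001)
    print(translate_incoming(40001))                   # ('192.168.1.11', 51000)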

Diagnostic tools

Computer operating systems provide various diagnostic tools to examine their network interface and address configuration. Windows provides the command-line tools ipconfig and netsh, and users of Unix-like systems can use ifconfig, netstat, route, lanstat, fstat or the iproute2 utilities to accomplish the task.

Internet security is a branch of computer security specifically related to the Internet, often involving browser security but also network security on a more general level, as it applies to other applications and operating systems as a whole. Its objective is to establish rules and measures to use against attacks over the Internet. The Internet represents an insecure channel for exchanging information, leading to a high risk of intrusion or fraud, such as phishing. Different methods have been used to protect the transfer of data, including encryption.

Types of security

Network layer security

TCP/IP, which stands for Transmission Control Protocol (TCP) and Internet Protocol (IP) and is also known as the Internet protocol suite, can be made secure with the help of cryptographic methods and protocols. These protocols include Secure Sockets Layer (SSL), succeeded by Transport Layer Security (TLS), for web traffic, Pretty Good Privacy (PGP) for email, and IPsec for network layer security.
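As a minimal sketch of the TLS side of this, the Python snippet below opens a TLS-protected connection to a web server using the standard ssl and socket modules and prints the negotiated protocol version; example.com is just a placeholder host for the illustration.

    import socket
    import ssl

    HOST = "example.com"   # placeholder host used for the example
    PORT = 443             # standard HTTPS port

    # Create a context with sensible defaults (certificate and hostname checks enabled)
    context = ssl.create_default_context()

    with socket.create_connection((HOST, PORT)) as raw_sock:
        with context.wrap_socket(raw_sock, server_hostname=HOST) as tls_sock:
            print("Negotiated protocol:", tls_sock.version())            # e.g. 'TLSv1.3'
            print("Peer certificate subject:", tls_sock.getpeercert().get("subject"))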

Internet Protocol Security (IPsec) 

This protocol is designed to protect communication in a secure manner over TCP/IP (the Internet protocol suite). It is a set of security extensions developed by the Internet Engineering Task Force (IETF), and it provides security and authentication at the IP layer by transforming data using encryption. Two main transformations form the basis of IPsec: the Authentication Header (AH) and the Encapsulating Security Payload (ESP). These two protocols provide data integrity, data origin authentication, and an anti-replay service. They can be used alone or in combination to provide the desired set of security services for the Internet Protocol (IP) layer.

The basic components of the IPsec security architecture are described in terms of the following functionalities:

  • Security protocols for AH and ESP
  • Security association for policy management and traffic processing
  • Manual and automatic key management for the Internet key exchange (IKE)
  • Algorithms for authentication and encryption

The set of security services provided at the IP layer includes access control, data origin integrity, protection against replays, and confidentiality. The algorithm allows these sets to work independently without affecting other parts of the implementation. The IPsec implementation is operated in a host or security gateway environment giving protection to IP traffic.

Security token

Some online sites offer customers the ability to use a six-digit code which randomly changes every 30–60 seconds on a security token. The security token performs built-in mathematical computations based on the current time built into the device. This means that every thirty seconds only a certain set of numbers is valid for access to the online account. The website that the user is logging into knows that device’s serial number, the computation and the correct time built into the device, so it can verify that the number given is indeed one of the handful of six-digit numbers valid in that 30–60 second cycle. After 30–60 seconds the device presents a new random six-digit number which can log into the website.
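The time-based codes described above are commonly generated along the lines of the TOTP scheme (RFC 6238). The sketch below is a minimal, illustrative implementation using only the Python standard library; the shared secret is a made-up example, and a real token and site would agree on the secret, digit count and time step out of band.

    import hashlib
    import hmac
    import struct
    import time

    def totp(secret: bytes, time_step: int = 30, digits: int = 6) -> str:
        """Return a time-based one-time code derived from a shared secret."""
        # Number of time steps since the Unix epoch, packed as a 64-bit counter
        counter = int(time.time() // time_step)
        msg = struct.pack(">Q", counter)

        digest = hmac.new(secret, msg, hashlib.sha1).digest()

        # Dynamic truncation: take 4 bytes at an offset given by the last nibble
        offset = digest[-1] & 0x0F
        code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % (10 ** digits)).zfill(digits)

    shared_secret = b"example-shared-secret"   # placeholder secret for illustration
    print("Current code:", totp(shared_secret))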

Electronic mail security (E-mail)

Background

Email messages are composed, delivered, and stored in a multiple step process, which starts with the message’s composition. When the user finishes composing the message and sends it, the message is transformed into a standard format: an RFC 2822 formatted message. Afterwards, the message can be transmitted. Using a network connection, the mail client, referred to as a mail user agent (MUA), connects to a mail transfer agent (MTA) operating on the mail server. The mail client then provides the sender’s identity to the server. Next, using the mail server commands, the client sends the recipient list to the mail server. The client then supplies the message. Once the mail server receives and processes the message, several events occur: recipient server identification, connection establishment, and message transmission. Using Domain Name System (DNS) services, the sender’s mail server determines the mail server(s) for the recipient(s). Then, the server opens up a connection(s) to the recipient mail server(s) and sends the message employing a process similar to that used by the originating client, delivering the message to the recipient(s).
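The hand-off from mail client (MUA) to mail server (MTA) described above can be sketched with Python's standard smtplib and email modules. The server name, addresses and credentials below are placeholders; a real deployment would normally also use STARTTLS or SMTPS and authentication.

    import smtplib
    from email.message import EmailMessage

    # Compose an RFC 2822/5322-style message
    msg = EmailMessage()
    msg["From"] = "sender@example.org"        # placeholder addresses
    msg["To"] = "recipient@example.com"
    msg["Subject"] = "Test message"
    msg.set_content("Hello from the mail user agent.")

    # Hand the message to the mail transfer agent over SMTP
    with smtplib.SMTP("mail.example.org", 587) as mta:   # placeholder MTA host
        mta.starttls()                                   # upgrade to an encrypted channel
        # mta.login("user", "password")                  # authentication would go here
        mta.send_message(msg)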

Pretty Good Privacy (PGP)

Pretty Good Privacy provides confidentiality by encrypting messages to be transmitted or data files to be stored, using an encryption algorithm such as Triple DES or CAST-128. Email messages can be protected by using cryptography in various ways, such as the following:

  • Signing an email message to ensure its integrity and confirm the identity of its sender.
  • Encrypting the body of an email message to ensure its confidentiality.
  • Encrypting the communications between mail servers to protect the confidentiality of both message body and message header.

The first two methods, message signing and message body encryption, are often used together; however, encrypting the transmissions between mail servers is typically used only when two organizations want to protect emails regularly sent between each other. For example, the organizations could establish a virtual private network (VPN) to encrypt the communications between their mail servers over the Internet. Unlike methods that can only encrypt a message body, a VPN can encrypt entire messages, including email header information such as senders, recipients, and subjects. In some cases, organizations may need to protect header information. However, a VPN solution alone cannot provide a message signing mechanism, nor can it provide protection for email messages along the entire route from sender to recipient.

Multipurpose Internet Mail Extensions (MIME)

MIME transforms non-ASCII data at the sender’s site into Network Virtual Terminal (NVT) ASCII data and delivers it to the client’s Simple Mail Transfer Protocol (SMTP) to be sent through the Internet. The SMTP server at the receiver’s side receives the NVT ASCII data and delivers it to MIME to be transformed back into the original non-ASCII data.
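A small Python example of this transformation, using the standard email package: text containing non-ASCII characters is wrapped in a MIME part and serialized with a transfer encoding that is safe for transport over SMTP.

    from email.mime.text import MIMEText

    # Non-ASCII body text; MIME takes care of encoding it for transport
    body = "Café menu – prices in Leones"
    part = MIMEText(body, "plain", "utf-8")

    # The serialized form carries the charset and a base64 transfer encoding,
    # so only ASCII bytes travel over SMTP
    print(part.as_string())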

Message Authentication Code

A Message Authentication Code (MAC) is a cryptographic technique that uses a secret key to compute a short tag over a message. The receiver, using the same secret key as the sender, recomputes the tag and compares it with the one received. The Message Authentication Code protects both a message’s data integrity and its authenticity.
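A common way to compute a MAC in practice is HMAC, available in the Python standard library. The sketch below tags a message with a shared secret key and shows how the receiver verifies it; the key and message are made-up examples.

    import hashlib
    import hmac

    secret_key = b"shared-secret-key"            # placeholder key known to both sides
    message = b"Transfer 100 units to account 42"

    # Sender computes the tag and sends it along with the message
    tag = hmac.new(secret_key, message, hashlib.sha256).hexdigest()

    # Receiver recomputes the tag and compares in constant time
    expected = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
    print("Message authentic:", hmac.compare_digest(tag, expected))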

Firewalls


A computer firewall controls access between networks. It generally consists of gateways and filters, which vary from one firewall to another. Firewalls also screen network traffic and can block traffic that is dangerous. Firewalls can act as an intermediate server for SMTP and Hypertext Transfer Protocol (HTTP) connections.

Role of firewalls in web security

Firewalls impose restrictions on incoming and outgoing network packets to and from private networks. All incoming or outgoing traffic must pass through the firewall, and only authorized traffic is allowed through. Firewalls create checkpoints between an internal private network and the public Internet, also known as choke points (borrowed from the identical military term for a combat-limiting geographical feature). Firewalls can create choke points based on IP source address and TCP port number. They can also serve as a platform for IPsec: using tunnel mode capability, a firewall can be used to implement VPNs. Firewalls can also limit network exposure by hiding the internal network system and information from the public Internet.

Types of firewall

Packet filter

A packet filter is a first generation firewall that processes network traffic on a packet-by-packet basis. Its main job is to filter traffic from a remote IP host, so a router is needed to connect the internal network to the Internet. The router is known as a screening router, which screens packets leaving and entering the network.
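The sketch below models a screening router's rule table in a few lines of Python: each rule matches on source network and destination port and either allows or denies the packet. Real packet filters work inside the OS kernel or router firmware; the rules and packets here are invented for illustration.

    import ipaddress

    # Ordered rule table: (source network, destination port or None for any, action)
    RULES = [
        (ipaddress.ip_network("203.0.113.0/24"), None, "deny"),    # block a misbehaving subnet
        (ipaddress.ip_network("0.0.0.0/0"), 80, "allow"),          # web traffic allowed
        (ipaddress.ip_network("0.0.0.0/0"), None, "deny"),         # default deny
    ]

    def filter_packet(src_ip: str, dst_port: int) -> str:
        """Return the action of the first matching rule."""
        src = ipaddress.ip_address(src_ip)
        for network, port, action in RULES:
            if src in network and (port is None or port == dst_port):
                return action
        return "deny"

    print(filter_packet("198.51.100.7", 80))    # allow
    print(filter_packet("203.0.113.9", 80))     # deny
    print(filter_packet("198.51.100.7", 23))    # deny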

Stateful packet inspection

In a stateful firewall, the circuit-level gateway is a proxy server that operates at the network level of the Open Systems Interconnection (OSI) model and statically defines what traffic will be allowed. Circuit proxies forward network packets (formatted units of data) containing a given port number if the port is permitted by the rule set. The main advantage of a proxy server is its ability to provide Network Address Translation (NAT), which can hide the user’s IP address from the Internet, effectively protecting internal information from the outside.

Application-level gateway

An application-level firewall is a third generation firewall where a proxy server operates at the very top of the OSI model, the IP suite application level. A network packet is forwarded only if a connection is established using a known protocol. Application-level gateways are notable for analyzing entire messages rather than individual packets of data when the data are being sent or received.

Malicious software

A computer user can be tricked or forced into downloading software onto a computer that is of malicious intent. Such software comes in many forms, such as viruses, Trojan horses, spyware, and worms.

  • Malware, short for malicious software, is any software used to disrupt computer operation, gather sensitive information, or gain access to private computer systems. Malware is defined by its malicious intent, acting against the requirements of the computer user, and does not include software that causes unintentional harm due to some deficiency. The term badware is sometimes used, and applied to both true (malicious) malware and unintentionally harmful software.
  • A botnet is a network of zombie computers that have been taken over by a robot or bot that performs large-scale malicious acts for the creator of the botnet.
  • Computer Viruses are programs that can replicate their structures or effects by infecting other files or structures on a computer. The common use of a virus is to take over a computer to steal data.
  • Computer worms are programs that can replicate themselves throughout a computer network, performing malicious tasks throughout.
  • Ransomware is a type of malware which restricts access to the computer system that it infects, and demands a ransom paid to the creator(s) of the malware in order for the restriction to be removed.
  • Scareware is scam software with malicious payloads, usually of limited or no benefit, that are sold to consumers via certain unethical marketing practices. The selling approach uses social engineering to cause shock, anxiety, or the perception of a threat, generally directed at an unsuspecting user.
  • Spyware refers to programs that surreptitiously monitor activity on a computer system and report that information to others without the user’s consent.
  • A Trojan horse, commonly known as a Trojan, is a general term for malicious software that pretends to be harmless, so that a user willingly allows it to be downloaded onto the computer.

Denial-of-service attack

A denial-of-service attack (DoS attack) or distributed denial-of-service attack (DDoS attack) is an attempt to make a computer resource unavailable to its intended users. Although the means to carry out, motives for, and targets of a DoS attack may vary, it generally consists of the concerted efforts to prevent an Internet site or service from functioning efficiently or at all, temporarily or indefinitely. According to businesses who participated in an international business security survey, 25% of respondents experienced a DoS attack in 2007 and 16.8% experienced one in 2010.

Phishing


Phishing is another common threat to the Internet. “RSA, the Security Division of EMC, today announced the findings of its January 2013 Fraud Report, estimating the global losses from phishing at $1.5 billion in 2012.” Filter evasion, website forgery, phone phishing and covert redirect are some well-known phishing techniques.

Hackers use a variety of tools to conduct phishing attacks. They create forged websites that imitate legitimate ones in order to get users to hand over their personal information. These sites are often hosted on legitimate hosting services paid for with stolen credit cards, while a more recent trend is to use bulk mailing systems and harvested mailing lists to reach potential victims.

Browser choice

Web browser market share tends to affect the degree to which a Web browser is exploited. For example, Internet Explorer 6, which used to hold a majority of the browser market, is considered extremely insecure because its vulnerabilities were heavily exploited due to its former popularity. Since browser choice is now more evenly distributed (Internet Explorer at 28.5%, Firefox at 18.4%, Google Chrome at 40.8%, and so on), vulnerabilities are exploited in many different browsers.

Application vulnerabilities

Applications used to access Internet resources may contain security vulnerabilities such as memory safety bugs or flawed authentication checks. The most severe of these bugs can give network attackers full control over the computer. Most security applications and suites are incapable of adequate defense against these kinds of attacks.

Internet security products

Antivirus

Antivirus and Internet security programs can protect a programmable device from malware by detecting and eliminating viruses; Antivirus software was mainly shareware in the early years of the Internet, but there are now several free security applications on the Internet to choose from for all platforms.

Security suites

So-called “security suites” were first offered for sale in 2003 (McAfee) and contain a suite of firewalls, anti-virus, anti-spyware and more. They may now offer theft protection, portable storage device safety checks, private Internet browsing, cloud anti-spam, a file shredder or the ability to make security-related decisions (answering popup windows), and several were free of charge as of at least 2012.

 

 

INTRODUCTION TO COMPUTER 

A computer is an electronic device that receives input, stores or processes the input as per user instructions, and provides output in the desired format.

 

The processes that can be applied to data are of two types −

  • Arithmetic operations − Examples include calculations like addition, subtraction, differentials, square root, etc.

  • Logical operations − Examples include comparison operations like greater than, less than, equal to, opposite, etc.

The basic parts of a computer are as follows −

  • Input Unit − Devices like keyboard and mouse that are used to input data and instructions to the computer are called input unit.

  • Output Unit − Devices like printer and visual display unit that are used to provide information to the user in desired format are called output unit.

  • Control Unit − As the name suggests, this unit controls all the functions of the computer. All devices or parts of computer interact through the control unit.

  • Arithmetic Logic Unit − This is the brain of the computer where all arithmetic operations and logical operations take place.

  • Memory − All input data, instructions and data interim to the processes are stored in the memory. Memory is of two types – primary memory and secondary memory. Primary memory resides within the CPU whereas secondary memory is external to it.

Control unit, arithmetic logic unit and memory are together called the central processing unit or CPU. Computer devices like keyboard, mouse, printer, etc. that we can see and touch are the hardware components of a computer. The set of instructions or programs that make the computer function using these hardware parts are called software. We cannot see or touch software. Both hardware and software are necessary for working of a computer.

 

Characteristics of Computer

To understand why computers are such an important part of our lives, let us look at some of its characteristics −

  • Speed − A computer can typically carry out millions, and modern processors even billions, of instructions per second.

  • Accuracy − Computers exhibit a very high degree of accuracy. Errors that do occur are usually due to inaccurate data, wrong instructions or bugs in chips – all human errors.

  • Reliability − Computers can carry out the same type of work repeatedly without throwing up errors due to tiredness or boredom, which are very common among humans.

  • Versatility − Computers can carry out a wide range of work, from data entry and ticket booking to complex mathematical calculations and continuous astronomical observations. If you can input the necessary data with correct instructions, the computer will do the processing.

  • Storage Capacity − Computers can store very large amounts of data at a fraction of the cost of traditional filing. Also, the data is safe from the normal wear and tear associated with paper.

Advantages of Using Computer

Now that we know the characteristics of computers, we can see the advantages that computers offer−

  • Computers can do the same task repetitively with same accuracy.

  • Computers do not get tired or bored.

  • Computers can take up routine tasks while releasing human resource for more intelligent functions.

Disadvantages of Using Computer

Despite so many advantages, computers have some disadvantages of their own −

  • Computers have no intelligence; they follow the instructions blindly without considering the outcome.

  • A regular electricity supply is necessary to make computers work, which can be difficult to ensure everywhere, especially in developing nations.

Booting

Starting a computer or a computer-embedded device is called booting. Booting involves the following steps −

  • Switching on power supply
  • Loading operating system into computer’s main memory
  • Keeping all applications in a state of readiness in case needed by the user

The first program or set of instructions that run when the computer is switched on is called BIOS or Basic Input Output System. BIOS is a firmware, i.e. a piece of software permanently programmed into the hardware.

If a system is already running but needs to be restarted, it is called rebooting. Rebooting may be required if software or hardware has been installed or if the system is unusually slow.

There are two types of booting −

  • Cold Booting − When the system is started by switching on the power supply it is called cold booting. The next step in cold booting is loading of BIOS.

  • Warm Booting − When the system is already running and needs to be restarted or rebooted, it is called warm booting. Warm booting is faster than cold booting because BIOS is not reloaded.

 

Historically, computers were classified according to processor types because developments in processors and processing speed were the main benchmarks. The earliest computers used vacuum tubes for processing, were huge and broke down frequently. However, as vacuum tubes were replaced by transistors and then chips, computers became smaller and their processing speeds increased manifold.

All modern computers and computing devices use microprocessors whose speeds and storage capacities keep increasing. The developmental benchmark for computers is now their size. Computers are now classified on the basis of their use or size −

  • Desktop
  • Laptop
  • Tablet
  • Server
  • Mainframe
  • Supercomputer

Let us look at all these types of computers in detail.

Desktop

Desktop computers are personal computers (PCs) designed for use by an individual at a fixed location. IBM was one of the first companies to introduce and popularize the desktop computer. A desktop unit typically has a CPU (Central Processing Unit), monitor, keyboard and mouse. The introduction of desktops popularized the use of computers among common people, as they were compact and affordable.

 

Riding on the wave of the desktop’s popularity, many software products and hardware devices were developed specially for the home or office user. The foremost design consideration here was user friendliness.

Laptop

Despite their huge popularity, desktops gave way to a more compact and portable personal computer called the laptop in the 2000s. Laptops are also called notebook computers or simply notebooks. Laptops run on batteries and connect to networks using Wi-Fi (Wireless Fidelity) chips. They also have chips for energy efficiency so that they can conserve power whenever possible and have longer battery life.

 

Modern laptops have enough processing power and storage capacity to be used for all office work, website designing, software development and even audio/video editing.

Tablet

After laptops, computers were further miniaturized to develop machines that have the processing power of a desktop but are small enough to be held in one’s palm. Tablets have a touch-sensitive screen, typically 5 to 10 inches across, where a finger is used to touch icons and invoke applications.

 

A keyboard is also displayed virtually whenever required and used with touch strokes. Applications that run on tablets are called apps. Tablets use operating systems from Microsoft (Windows 8 and later versions) or Google (Android). Apple has developed its own tablet called the iPad, which uses the proprietary iOS operating system.

Server

Servers are computers with high processing speeds that provide one or more services to other systems on the network. They may or may not have screens attached to them. A group of computers or digital devices connected together to share resources is called a network.

 

Servers have high processing powers and can handle multiple requests simultaneously. Most commonly found servers on networks include −

  • File or storage server
  • Game server
  • Application server
  • Database server
  • Mail server
  • Print server

Mainframe

Mainframes are computers used by organizations like banks, airlines and railways to handle millions of online transactions per second. Important features of mainframes are −

  • Big in size
  • Hundreds of times faster than typical servers, processing data at rates of hundreds of megabytes per second
  • Very expensive
  • Use proprietary OS provided by the manufacturers
  • In-built hardware, software and firmware security features

Supercomputer

Supercomputers are the fastest computers on Earth. They are used for carrying out complex, fast and time intensive calculations for scientific and engineering applications. Supercomputer speed or performance is measured in teraflops, i.e. 10¹² floating point operations per second.

The Sunway TaihuLight supercomputer has been ranked the world’s fastest, with a rating of 93 petaflops, i.e. 93 quadrillion floating point operations per second.

Most common uses of supercomputers include −

  • Molecular mapping and research
  • Weather forecasting
  • Environmental research
  • Oil and gas exploration

Basics of Computers - Software Concepts

As you know, hardware devices need user instructions to function. A set of instructions that achieves a single outcome is called a program or procedure. Many programs functioning together to carry out a task make up a piece of software.

For example, word-processing software enables the user to create, edit and save documents, and a web browser enables the user to view and share web pages and multimedia files. There are three broad categories of software −

  • System Software
  • Application Software
  • Utility Software

Let us discuss them in detail.

 

System Software

Software required to run the hardware parts of the computer and other application software is called system software. System software acts as an interface between the hardware and user applications. An interface is needed because hardware devices or machines and humans speak different languages.

Machines understand only binary language, i.e. 0 (absence of an electric signal) and 1 (presence of an electric signal), while humans speak in English, French, German, Tamil, Hindi and many other languages. English is the predominant language for interacting with computers. Software is required to convert all human instructions into machine understandable instructions, and this is exactly what system software does.

Based on its function, system software is of three types −

  • Operating System
  • Language Processor
  • Device Drivers

Operating System

System software that is responsible for functioning of all hardware parts and their interoperability to carry out tasks successfully is called operating system (OS). OS is the first software to be loaded into computer memory when the computer is switched on and this is called booting. OS manages a computer’s basic functions like storing data in memory, retrieving files from storage devices, scheduling tasks based on priority, etc.

Language Processor

As discussed earlier, an important function of system software is to convert all user instructions into machine understandable language. When we talk of human machine interactions, languages are of three types −

  • Machine-level language − This language is nothing but a string of 0s and 1s that the machines can understand. It is completely machine dependent.

  • Assembly-level language − This language introduces a layer of abstraction by defining mnemonics. Mnemonics are English like words or symbols used to denote a long string of 0s and 1s. For example, the word “READ” can be defined to mean that computer has to retrieve data from the memory. The complete instruction will also tell the memory address. Assembly level language is machine dependent.

  • High level language − This language uses English like statements and is completely independent of machines. Programs written using high level languages are easy to create, read and understand.

A program written in a high level programming language like Java, C++, etc. is called source code. The set of instructions in machine readable form is called object code or machine code. System software that converts source code into object code is called a language processor. There are three types of language processors −

  • Assembler − Converts assembly level program into machine level program.

  • Interpreter − Converts a high level program into machine level code line by line.

  • Compiler − Converts a high level program into machine level code in one go rather than line by line.

Device Drivers

System software that controls and monitors functioning of a specific device on computer is called device driver. Each device like printer, scanner, microphone, speaker, etc. that needs to be attached externally to the system has a specific driver associated with it. When you attach a new device, you need to install its driver so that the OS knows how it needs to be managed.

Application Software

Software that performs a single task and nothing else is called application software. Application software is very specialized in its function and approach to solving a problem: a spreadsheet program can only do operations with numbers and nothing else, and a hospital management program will manage hospital activities and nothing else. Here are some commonly used application software −

  • Word processing
  • Spreadsheet
  • Presentation
  • Database management
  • Multimedia tools

Utility Software

Application software that assists system software in doing its work is called utility software. Thus utility software is actually a cross between system software and application software. Examples of utility software include −

  • Antivirus software
  • Disk management tools
  • File management tools
  • Compression tools
  • Backup tools

 Basics of Computers - System S/W

 

As you know, system software acts as an interface for the underlying hardware system. Here we will discuss some important system software in detail.


Operating System

The operating system (OS) is the lifeline of the computer. You connect all the basic devices like CPU, monitor, keyboard and mouse, plug in the power supply and switch it on, thinking you have everything in place. But the computer will not start or come to life unless it has an operating system installed on it, because the OS −

  • Keeps all hardware parts in a state of readiness to follow user instructions
  • Co-ordinates between different devices
  • Schedules multiple tasks as per priority
  • Allocates resource to each task
  • Enables computer to access network
  • Enables users to access and use application software

Besides initial booting, these are some of the functions of an operating system −

  • Managing computer resources like hardware, software, shared resources, etc.
  • Allocating resources
  • Preventing errors during software use
  • Controlling improper use of the computer

One of the earliest operating systems was MS-DOS, developed by Microsoft for the IBM PC. It was a Command Line Interface (CLI) OS that revolutionized the PC market. DOS was difficult to use because of its interface: users needed to remember instructions to do their tasks. To make computers more accessible and user-friendly, Microsoft developed a Graphical User Interface (GUI) based OS called Windows, which transformed the way people used computers.

Assembler

An assembler is system software that converts assembly level programs into machine level code.


These are the advantages provided by assembly level programming −

  • Increases the efficiency of the programmer, as remembering mnemonics is easier
  • Increases productivity, as the number of errors and hence the debugging time decreases
  • Gives the programmer access to hardware resources and hence the flexibility to write programs customized to the specific computer

Interpreter

The major advantage of assembly level language was its ability to optimize memory usage and hardware utilization. However, with technological advancements computers had more memory and better hardware components. So ease of writing programs became more important than optimizing memory and other hardware resources.

In addition, a need was felt to take programming out of a handful of trained scientists and computer programmers, so that computers could be used in more areas. This led to development of high level languages that were easy to understand due to resemblance of commands to English language.

The system software used to translate high level language source code into machine level language object code line by line is called an interpreter. An interpreter takes each line of code and converts it into machine code and stores it into the object file.

The advantage of using an interpreter is that it is very easy to write and does not require a large memory space. However, there is a major disadvantage: interpreted programs take a long time to execute. To overcome this disadvantage, especially for large programs, compilers were developed.

Compiler

System software that stores the complete program, scans it, translates the whole program into object code and then creates executable code is called a compiler. On the face of it, compilers compare unfavorably with interpreters because they −

  • are more complex than interpreters
  • need more memory space
  • take more time in compiling source code

However, compiled programs execute very fast on computers. The following image shows the step-by-step process of how a source code is transformed into an executable code −


These are the steps in compiling source code into executable code −

  • Pre-processing − In this stage, pre-processor directives, typically used by languages like C and C++, are processed, i.e. expanded into plain source code (for example, header files are included and macros are replaced).

  • Lexical analysis − Here all instructions are converted into lexical units (tokens) such as constants, variables, arithmetic symbols, etc. (a small tokenizer sketch follows this list).

  • Parsing − Here all instructions are checked to see whether they conform to the grammar rules of the language. If there are errors, the compiler will ask you to fix them before you can proceed.

  • Compiling − At this stage the source code is converted into object code.

  • Linking − If there are any links to external files or libraries, the addresses of their executable code are added to the program. Also, if the code needs to be rearranged for actual execution, it is rearranged. The final output is the executable code that is ready to run.
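To make the lexical analysis step more concrete, the short Python sketch below breaks a line of source code into tokens (numbers, identifiers and operators) using a regular expression; it is only a toy illustration of what a real compiler's lexer does.

    import re

    # One pattern per token class, tried in order
    TOKEN_SPEC = [
        ("NUMBER",     r"\d+"),
        ("IDENTIFIER", r"[A-Za-z_]\w*"),
        ("OPERATOR",   r"[+\-*/=()]"),
        ("SKIP",       r"\s+"),
    ]
    TOKEN_RE = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

    def tokenize(line: str):
        """Yield (token_type, text) pairs for a single line of source code."""
        for match in TOKEN_RE.finditer(line):
            if match.lastgroup != "SKIP":
                yield match.lastgroup, match.group()

    for token in tokenize("total = price * 3 + tax"):
        print(token)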

 Basics of Computers - Functions of OS

As you know, the operating system is responsible for the functioning of the computer system. To do that, it carries out three broad categories of activities −

  • Essential functions − Ensures optimum and effective utilization of resources

  • Monitoring functions − Monitors and collects information related to system performance

  • Service functions − Provides services to users

Let us look at some of the most important functions associated with these activities.

Processor management

Managing a computer’s CPU to ensure its optimum utilization is called processor management. Managing the processor basically involves allocating processor time to the tasks that need to be completed. This is called job scheduling. Jobs must be scheduled in such a way that −

  • There is maximum utilization of CPU
  • Turnaround time, i.e. time required to complete each job, is minimum
  • Waiting time is minimum
  • Each job gets the fastest possible response time
  • Maximum throughput is achieved, where throughput is the number of jobs completed per unit time

There are two methods of job scheduling done by operating systems −

  • Preemptive scheduling
  • Non-Preemptive scheduling


Preemptive Scheduling

In this type of scheduling, the next job to be done by the processor can be scheduled before the current job completes. If a job of higher priority comes up, the processor can be forced to release the current job and take up the next one. There are two scheduling techniques that use pre-emptive scheduling −

  • Round robin scheduling − A small unit of time called a time slice is defined and each program gets only one time slice at a time. If it is not completed during that time, it must join the end of the job queue and wait until all programs have had one time slice. The advantage here is that all programs get an equal opportunity; the downside is that if a program completes execution before its time slice is over, the CPU is idle for the rest of that duration. (A small simulation sketch follows this list.)

  • Response ratio scheduling − Response ratio is defined as

    $$\frac{Elapsed \: Time}{Execution \: time \: received}$$

    A job with a higher response ratio gets higher priority. So a larger program may have to wait even if it was requested earlier than a shorter program. This improves the throughput of the CPU.
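Here is a minimal Python simulation of round robin scheduling: each job carries its remaining CPU time, gets one time slice per turn, and rejoins the end of the queue if it is not finished. The job names and times are invented for the example.

    from collections import deque

    TIME_SLICE = 2   # arbitrary time units per turn

    # (job name, remaining CPU time needed)
    jobs = deque([("A", 5), ("B", 2), ("C", 4)])

    clock = 0
    while jobs:
        name, remaining = jobs.popleft()
        run = min(TIME_SLICE, remaining)
        clock += run
        remaining -= run
        if remaining > 0:
            jobs.append((name, remaining))       # not finished: back of the queue
            print(f"t={clock:2}: {name} ran {run}, {remaining} left")
        else:
            print(f"t={clock:2}: {name} finished")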

Non-preemptive Scheduling

In this type of scheduling, job scheduling decisions are taken only after the current job completes. A job is never interrupted to give precedence to higher priority jobs. Scheduling techniques that use non-preemptive scheduling are −

  • First come first serve scheduling − This is the simplest technique where the first program to throw up a request is completed first.

  • Shortest job next scheduling − Here the job that needs least amount of time for execution is scheduled next.

  • Deadline scheduling − The job with the earliest deadline is scheduled for execution next.

Memory Management

The process of regulating computer memory and using optimization techniques to enhance overall system performance is called memory management. Memory space is very important in the modern computing environment, so memory management is an important role of the operating system.

As you know, computers have two types of memory – primary and secondary. Primary memory is fast but expensive, while secondary memory is cheap but slower. The OS has to strike a balance between the two to ensure that system performance does not suffer because of too little primary memory and that system costs do not shoot up because of too much primary memory.

Input and output data, user instructions and data interim to program execution need to be stored, accessed and retrieved efficiently for high system performance. Once a program request is accepted, OS allocates it primary and secondary storage areas as per requirement. Once execution is completed, the memory space allocated to it is freed. OS uses many storage management techniques to keep a track of all storage spaces that are allocated or free.

Contiguous Storage Allocation

This is the simplest storage space allocation technique where contiguous memory locations are assigned to each program. OS has to estimate the amount of memory required for the complete process before allocation.

Non-contiguous Storage Allocation

As the name suggests, program and associated data need not be stored in contiguous locations. The program is divided into smaller components and each component is stored in a separate location. A table keeps a record of where each component of the program is stored. When the processor needs to access any component, OS provides access using this allocation table.

In a real-life scenario, primary memory space may not be sufficient to store the whole program. In that case, the OS uses a virtual storage technique, where the program is physically stored in secondary memory but appears to be stored in primary memory. This introduces a minuscule time lag in accessing program components. There are two approaches to virtual storage −

  • Program paging − A program is broken down into fixed-size pages and stored in secondary memory. The pages are given logical (virtual) addresses from 0 to n. A page table maps the logical addresses to physical addresses and is used to retrieve the pages when required (a small page-table sketch appears after the next paragraph).

  • Program segmentation − A program is broken down into logical units called segments, assigned logical addresses from 0 to n and stored in secondary memory. A segment table is used to load segments from secondary memory into primary memory.

Operating systems typically use a combination of paging and segmentation to optimize memory usage. A large program segment may be broken into pages, or several small segments may be stored as a single page.
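The sketch below shows the page-table idea in Python: a logical address is split into a page number and an offset, the page table maps the page to a physical frame, and the physical address is reassembled. The page size, table contents and addresses are made up for the example.

    PAGE_SIZE = 1024   # bytes per page (example value)

    # Page table: logical page number -> physical frame number
    page_table = {0: 5, 1: 9, 2: 3}

    def translate(logical_address: int) -> int:
        """Translate a logical address into a physical address via the page table."""
        page, offset = divmod(logical_address, PAGE_SIZE)
        frame = page_table[page]     # a missing page would raise KeyError (a "page fault")
        return frame * PAGE_SIZE + offset

    for addr in [100, 1500, 2300]:
        print(f"logical {addr:4} -> physical {translate(addr)}")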

File Management

Data and information are stored on computers in the form of files. Managing the file system to enable users to keep their data safe and correct is an important function of operating systems; this is called file management. File management provides tools for these file-related activities −

  • Creating new files for storing data
  • Updating
  • Sharing
  • Securing data through passwords and encryption
  • Recovery in case of system failure

Device Management

The process of implementation, operation and maintenance of a device by operating system is called device management. Operating system uses a utility software called device driver as interface to the device.

When many processes access the devices or request access to the devices, the OS manages the devices in a way that efficiently shares the devices among all processes. Processes access devices through system call interface, a programming interface provided by the OS.

 

As computers and computing technologies have evolved over the years, so has their usage across many fields. To meet growing requirements, more and more customized software has flooded the market. As every piece of software needs an operating system to function, operating systems have also evolved over the years to meet the growing demands on their techniques and capabilities. Here we discuss some common types of OS based on their working techniques, as well as some popularly used operating systems.

GUI OS

GUI is the acronym for Graphical User Interface. An operating system that presents an interface comprising graphics and icons is called a GUI OS. GUI OS is very easy to navigate and use as users need not remember commands to be given to accomplish each task. Examples of GUI OS includes Windows, macOS, Ubuntu, etc.

Time Sharing OS

Operating systems that schedule tasks for efficient processor use are called time sharing OS. Time sharing, or multitasking, is used by operating systems when multiple users located at different terminals need processor time to complete their tasks. Many scheduling techniques like round robin scheduling and shortest job next scheduling are used by time sharing OS.

Real Time OS

An operating system that guarantees to process live events or data and deliver the results within a stipulated span of time is called a real time OS. It may be single tasking or multitasking.

Distributed OS

An operating system that manages many computers but presents the interface of a single computer to the user is called a distributed OS. Such an OS is required when computational requirements cannot be met by a single computer and more systems have to be used. User interaction is restricted to a single system; it is the OS that distributes work to multiple systems and then presents the consolidated output as if one computer had worked on the problem at hand.

Popular Operating Systems

Initially computers had no operating systems. Every program needed full hardware specifications to run correctly as processor, memory and device management had to be done by the programs themselves. However, as sophisticated hardware and more complex application programs developed, operating systems became essential. As personal computers became popular among individuals and small businesses, demand for standard operating system grew. Let us look at some of the currently popular operating systems −

  • Windows − Windows is a GUI operating system first developed by Microsoft in 1985. The latest version at the time of writing is Windows 10. Windows is used on almost 88% of PCs and laptops globally.

  • Linux − Linux is an open source operating system mostly used on mainframes and supercomputers. Being open source means that its code is available for free and anyone can develop a new OS based on it.

  • BOSS − Bharat Operating System Solutions is an Indian distribution of Linux based on Debian. It is localized to enable the use of local Indian languages. BOSS consists of −

    • Linux kernel
    • Office application suite BharteeyaOO
    • Web browser
    • Email client Thunderbird
    • Chat application Pidgin
    • File sharing applications
    • Multimedia applications

Mobile OS

An operating system for smartphones, tablets and other mobile devices is called a mobile OS. Some of the most popular operating systems for mobile devices include −

  • Android − This Linux-based OS by Google is the most popular mobile OS currently. Almost 85% of mobile devices use it.

  • Windows Phone − A mobile OS developed by Microsoft; Windows Phone 7 was one of its earlier versions.

  • Apple iOS − This mobile OS is developed by Apple exclusively for its own mobile devices such as the iPhone and iPad.

  • Blackberry OS − This is the OS used by BlackBerry mobile devices such as smartphones and the PlayBook tablet.

Basics of Computers - Utility Software

Application software that assists the OS in carrying out certain specialized tasks is called utility software. Let us look at some of the most popular utility software.

Antivirus

A virus can be defined as a malicious program that attaches itself to a host program and makes multiple copies of itself, slowing down, corrupting or destroying the system. A software that assists the OS in providing virus free environment to the users is called antivirus. An anti-virus scans the system for any virus and if detected, gets rid of it by deleting or isolating it. It can detect many types of virus like boot virus, Trojan, worm, spyware, etc.

When any external storage device like USB drive is attached to the system, anti-virus software scans it and gives an alert if a virus is detected. You can set up your system for periodic scans or scan whenever you feel the need. A combination of both the techniques is advisable to keep your system virus free.

File management tools

As you know, file management is an important function of operating systems, as all data and instructions are stored in the computer in the form of files. Utility software providing regular file management tasks like browsing, searching, updating, previewing, etc. is called a file management tool. Windows Explorer in Windows, Google Desktop, Directory Opus, Double Commander, etc. are examples of such tools.

Compression tools

Storage space is always at a premium in computer systems, so operating systems are always looking for ways to minimize the amount of storage space taken up by files. Compression tools are utilities that assist operating systems in shrinking files so that they take up less space. After compression, files are stored in a different format and cannot be read or edited directly; they must be decompressed before they can be accessed for further use. Some of the popular compression tools are WinRAR, PeaZip, The Unarchiver, etc.

Disk Cleanup

Disk cleanup tools assist users in freeing up disk space. The software scans hard disks to find files that are no longer used and frees up space by deleting them.

Disk Defragmenter

Disk defragmenter is a disk management utility that increases file access speed by rearranging fragmented files into contiguous locations. Large files are broken down into fragments and may be stored in non-contiguous locations if contiguous ones are not available. When such files are accessed by the user, access speed is slow because of fragmentation. The disk defragmenter utility scans the hard disk and tries to reassemble file fragments so that they are stored in contiguous locations.

Backup

The backup utility enables backing up of files, folders, databases or complete disks. Backups are taken so that data may be restored in case of data loss. Backup is a service provided by all operating systems. On stand-alone systems the backup may be taken on the same or a different drive; on networked systems backups may be made on dedicated backup servers.

Basics of Computers - Open Source Software

Software whose source code is freely distributed with a license to study, change and further distribute it to anyone for any purpose is called open source software. Open source software is generally a team effort where dedicated programmers improve upon the source code and share the changes within the community. Thanks to its thriving communities, open source software provides these advantages to users −

  • Security
  • Affordability
  • Transparency
  • Interoperability on multiple platforms
  • Flexibility due to customization
  • Localization

Freeware

A software that is available free of cost for use and distribution but cannot be modified as its source code is not available is called freeware. Examples of freeware are Google Chrome, Adobe Acrobat PDF Reader, Skype, etc.

Shareware

Software that is initially free and can be distributed to others as well, but must be paid for after a stipulated period of time, is called shareware. Its source code is also not available and hence it cannot be modified.

Proprietary Software

Software that can be used only by obtaining a license from its developer after paying for it is called proprietary software. An individual or a company can own such proprietary software. Its source code is often a closely guarded secret and it can have major restrictions such as −

  • No further distribution
  • The number of users that can use it
  • The type of computer it can be installed on, for example a multi-user or single-user machine

For example, Microsoft Windows is a proprietary operating system that comes in many editions for different types of clients, such as single-user, multi-user, professional, etc.

 Basics of Computers - Office Tools

Application software that assist users in regular office jobs like creating, updating and maintaining documents, handling large amounts of data, creating presentations, scheduling, etc. are called office tools. Using office tools saves time and effort and lots of repetitive tasks can be done easily. Some of the software that do this are −

  • Word processors
  • Spreadsheets
  • Database systems
  • Presentation software
  • E-mail tools

Let us look at some of these in detail.

Word Processor

A software for creating, storing and manipulating text documents is called word processor. Some common word processors are MS-Word, WordPad, WordPerfect, Google docs, etc.


A word processor allows you to −

  • Create, save and edit documents
  • Format text properties like font, alignment, font color, background color, etc.
  • Check spelling and grammar
  • Add images
  • Add header and footer, set page margins and insert watermarks

Spreadsheet

A spreadsheet is software that assists users in processing and analyzing tabular data. It is a computerized accounting tool. Data is always entered in a cell (the intersection of a row and a column), and formulas and functions to process a group of cells are readily available. Some popular spreadsheet software includes MS-Excel, Gnumeric, Google Sheets, etc. Here is a list of activities that can be done within spreadsheet software −

  • Simple calculations like addition, average, counting, etc.
  • Preparing charts and graphs on a group of related data
  • Data entry
  • Data formatting
  • Cell formatting
  • Calculations based on logical comparisons


Presentation Tool

Presentation tool enables user to demonstrate information broken down into small chunks and arranged on pages called slides. A series of slides that present a coherent idea to an audience is called a presentation. The slides can have text, images, tables, audio, video or other multimedia information arranged on them. MS-PowerPoint, OpenOffice Impress, Lotus Freelance, etc. are some popular presentation tools.


Database Management System

Software that manages storage, updating and retrieval of data by creating databases is called database management system. Some popular database management tools are MS-Access, MySQL, Oracle, FoxPro, etc.


 

 

Computer memory is measured in terms of how many bits it can store. Here is a chart for memory capacity conversion.

  • 1 byte (B) = 8 bits
  • 1 Kilobytes (KB) = 1024 bytes
  • 1 Megabyte (MB) = 1024 KB
  • 1 Gigabyte (GB) = 1024 MB
  • 1 Terabyte (TB) = 1024 GB
  • 1 Petabyte (PB) = 1024 TB
  • 1 Exabyte (EB) = 1024 PB
  • 1 Zettabyte (ZB) = 1024 EB
  • 1 Yottabyte (YB) = 1024 ZB
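A small Python helper makes the chart above easy to apply: it converts a raw byte count into the largest convenient unit using factors of 1024.

    UNITS = ["B", "KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"]

    def human_readable(num_bytes: float) -> str:
        """Convert a byte count into a readable string using 1024-based units."""
        for unit in UNITS:
            if num_bytes < 1024 or unit == UNITS[-1]:
                return f"{num_bytes:.2f} {unit}"
            num_bytes /= 1024

    print(human_readable(8 * 1024))          # 8.00 KB
    print(human_readable(3_500_000_000))     # about 3.26 GB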

Octal Number System

The octal number system has eight digits – 0, 1, 2, 3, 4, 5, 6 and 7. It is also a positional value system, where each digit has its value expressed in powers of 8, as in the example below −


The decimal equivalent of any octal number is the sum of the products of each digit with its positional value.

726₈ = 7×8² + 2×8¹ + 6×8⁰

= 448 + 16 + 6

= 470₁₀

Hexadecimal Number System

The hexadecimal number system has 16 symbols – 0 to 9 and A to F, where A equals 10, B equals 11 and so on up to F, which equals 15. It is also a positional value system, where each digit has its value expressed in powers of 16, as in the example below −


The decimal equivalent of any hexadecimal number is the sum of the products of each digit with its positional value.

27FB₁₆ = 2×16³ + 7×16² + 15×16¹ + 11×16⁰

= 8192 + 1792 + 240 + 11

= 10235₁₀
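The two worked examples above can be checked directly in Python, whose int() function accepts a base argument and whose oct(), hex() and bin() functions convert back the other way.

    # Octal and hexadecimal to decimal
    print(int("726", 8))     # 470
    print(int("27FB", 16))   # 10235

    # Decimal back to octal, hexadecimal and binary
    print(oct(470), hex(10235), bin(10))   # 0o726 0x27fb 0b1010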

Number System Relationship

The following table depicts the relationship between decimal, binary, octal and hexadecimal number systems.

 

HEXADECIMAL   DECIMAL   OCTAL   BINARY
0             0         0       0000
1             1         1       0001
2             2         2       0010
3             3         3       0011
4             4         4       0100
5             5         5       0101
6             6         6       0110
7             7         7       0111
8             8         10      1000
9             9         11      1001
A             10        12      1010
B             11        13      1011
C             12        14      1100
D             13        15      1101
E             14        16      1110
F             15        17      1111

 

ASCII

Besides numerical data, a computer must be able to handle alphabets, punctuation marks, mathematical operators, special symbols, etc. that form the complete character set of the English language. The complete set of characters or symbols is called an alphanumeric code. A complete alphanumeric code typically includes −

  • 26 upper case letters
  • 26 lower case letters
  • 10 digits
  • 7 punctuation marks
  • 20 to 40 special characters

Now, a computer understands only numeric values, whatever the number system used. So every character must have a numeric equivalent, called its alphanumeric code. The most widely used alphanumeric code is the American Standard Code for Information Interchange (ASCII). ASCII is a 7-bit code that has 128 (2⁷) possible codes.
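A quick way to see these numeric equivalents is Python's ord() and chr() functions, which map characters to their code values and back.

    # Character to ASCII code and back
    print(ord("A"), ord("a"), ord("0"))   # 65 97 48
    print(chr(65), chr(97), chr(48))      # A a 0

    # All 128 ASCII codes fit in 7 bits
    print(max(ord(chr(code)) for code in range(128)) < 2**7)   # True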


ISCII

ISCII stands for Indian Script Code for Information Interchange. ISCII was developed to support Indian languages on computers. Languages supported by ISCII include Devanagari, Tamil, Bangla, Gujarati, Gurmukhi, Telugu, etc. ISCII is mostly used by government departments, and before it could catch on widely, a new universal encoding standard called Unicode was introduced.

Unicode

Unicode is an international coding system designed to be used with different language scripts. Each character or symbol is assigned a unique numeric value, and the scheme remains compatible with ASCII. Earlier, each script had its own encoding system, and these could conflict with each other.

In contrast, this is what Unicode officially aims to do: “Unicode provides a unique number for every character, no matter what the platform, no matter what the program, no matter what the language.”

 Microprocessor Concepts

The microprocessor is the brain of the computer; it does all the work. It is a computer processor that incorporates all the functions of a CPU (Central Processing Unit) on a single IC (Integrated Circuit), or at most a few ICs. Microprocessors were first introduced in the early 1970s. The Intel 4004 was the first general purpose microprocessor, and it paved the way for the processors Intel later built for personal computers. The arrival of low cost general purpose microprocessors has been instrumental in the development of modern society as we know it.

Microprocessor

We will study the characteristics and components of a microprocessor in detail.

Microprocessors Characteristics

Microprocessors are multipurpose devices that can be designed for generic or specialized functions. The microprocessors of laptops and smartphones are general purpose, whereas those designed for graphics processing or machine vision are specialized. There are some characteristics that are common to all microprocessors.

These are the most important defining characteristics of a microprocessor −

  • Clock speed
  • Instruction set
  • Word size

Clock Speed

Every microprocessor has an internal clock that regulates the speed at which it executes instructions and also synchronizes it with other components. The speed at which the microprocessor executes instructions is called its clock speed. Clock speeds are measured in MHz or GHz, where 1 MHz means 1 million cycles per second and 1 GHz equals 1 billion cycles per second. Here, a cycle refers to a single clock signal cycle.

Current microprocessors typically have clock speeds of around 3–4 GHz; speeds much higher than this generate enough heat to damage the chip itself. To overcome this, manufacturers put multiple processor cores working in parallel on a single chip.

Word Size

The number of bits that can be processed by a processor in a single instruction is called its word size. Word size determines the amount of memory that can be accessed at one go and influences the total number of pins on the microprocessor, which in turn affects its architecture.

The first commercial microprocessor, the Intel 4004, was a 4-bit processor: it transferred data 4 bits at a time over a 4-bit bus. Currently most microprocessors use 32-bit or 64-bit architectures.
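
One rough, illustrative way to see the word size of the machine a program runs on is to inspect the size of a pointer or of the long type, as in the C sketch below. This is only an approximation that holds on common desktop platforms, not a formal definition of word size.

#include <stdio.h>

int main(void) {
    /* The size of a pointer (in bits) is a rough indicator of the
       architecture's word size on typical desktop platforms. */
    printf("Pointer size: %zu bits\n", sizeof(void *) * 8);
    printf("long size:    %zu bits\n", sizeof(long) * 8);
    return 0;
}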

Instruction Set

A command given to a digital machine to perform an operation on a piece of data is called an instruction. The basic set of machine-level instructions that a microprocessor is designed to execute is called its instruction set. These instructions carry out the following types of operations −

  • Data transfer
  • Arithmetic operations
  • Logical operations
  • Control flow
  • Input/output and machine control

Microprocessor Components

Compared to the first microprocessors, today’s processors are very small, but they still have the same basic parts as the first models −

  • CPU
  • Bus
  • Memory

CPU

CPU is fabricated as a very large scale integrated circuit (VLSI) and has these parts −

  • Instruction register − It holds the instruction to be executed.

  • Decoder − It decodes the instruction (works out which operation and operands are required) and passes it on to the ALU (Arithmetic Logic Unit).

  • ALU − It has necessary circuits to perform arithmetic, logical, memory, register and program sequencing operations.

  • Register − It holds intermediate results obtained during program processing. Registers are used for holding such results rather than RAM because accessing registers is almost 10 times faster than accessing RAM.

Bus

The connection lines used to connect the internal parts of the microprocessor chip are called a bus. There are three types of buses in a microprocessor −

  • Data Bus − Lines that carry data to and from memory are called data bus. It is a bidirectional bus with width equal to word length of the microprocessor.

  • Address Bus − It is a unidirectional bus responsible for carrying the address of a memory location or I/O port from the CPU to memory or the I/O port.

  • Control Bus − Lines that carry control signals like clock signals, interrupt signal or ready signal are called control bus. They are bidirectional. Signal that denotes that a device is ready for processing is called ready signal. Signal that indicates to a device to interrupt its process is called an interrupt signal.

Memory

A microprocessor works with two types of memory −

  • RAM − Random Access Memory is volatile memory that gets erased when power is switched off. All data and instructions are stored in RAM.

  • ROM − Read Only Memory is non-volatile memory whose data remains intact even after power is switched off. Microprocessor can read from it any time it wants but cannot write to it. It is preprogrammed with most essential data like booting sequence by the manufacturer.

 Evolution of Microprocessor

The first microprocessor, introduced in 1971, was a 4-bit microprocessor with 4.5 KB of memory and a set of 45 instructions. In the five decades since, microprocessor transistor counts have roughly doubled every two years, as predicted by Intel co-founder Gordon Moore, and current microprocessors can access many gigabytes of memory. Depending on the width of data they can process, microprocessors fall into these categories −

  • 8-bit
  • 16-bit
  • 32-bit
  • 64-bit

The size of the instruction set is another important consideration while categorizing microprocessors. Initially, microprocessors had very small instruction sets because complex hardware was expensive as well as difficult to build.

As technology developed to overcome these issues, more and more complex instructions were added to increase the functionality of the microprocessor. However, it was soon realized that large instruction sets were counterproductive, as many rarely used instructions took up precious space. So the old school of thought that favoured smaller instruction sets gained popularity again.

Let us learn more about the two types of microprocessors based on their instruction set.

RISC

RISC stands for Reduced Instruction Set Computers. A RISC processor has a small set of highly optimized instructions. Complex instructions are implemented using simpler instructions, keeping the instruction set small. The design philosophy of RISC incorporates these salient points −

  • Number of instructions should be minimum.
  • Instructions should be of same length.
  • Simple addressing modes should be used
  • Reduce memory references to retrieve operands by adding registers

Some of the techniques used by RISC architecture include −

  • Pipelining − Instructions are fetched in sequence so that the fetching and execution of successive instructions overlap.

  • Single cycle execution − Most of RISC instructions take one CPU cycle to execute.

Examples of RISC processors include ARM, MIPS, PowerPC and SPARC.

CISC

CISC stands for Complex Instruction Set Computers. It supports hundreds of instructions. Computers supporting CISC can accomplish a wide variety of tasks, making them ideal for personal computers. These are some characteristics of CISC architecture −

  • Larger set of instructions
  • Instructions are of variable length
  • Complex addressing modes
  • Instructions take more than one clock cycle
  • Work well with simpler compilers

Examples of CISC processors are Intel 386 & 486, Pentium, Pentium II and III, Motorola 68000, etc.

EPIC

EPIC stands for Explicitly Parallel Instruction Computing. It is a computer architecture that is a cross between RISC and CISC, trying to provide the best of both. Its important features include −

  • Instructions are grouped into bundles that can execute in parallel, rather than being issued one at a time
  • A mechanism for communicating the compiler’s execution plan to the hardware
  • Programs must have sequential semantics

Examples of EPIC processors are Intel’s Itanium family, which implements the IA-64 architecture.

Basics of Computers - Primary Memory

Memory is required in computers to store data and instructions. Memory is physically organized as a large number of cells that are capable of storing one bit each. Logically, they are organized as groups of bits called words, each of which is assigned an address. Data and instructions are accessed through these memory addresses. The speed with which these memory addresses can be accessed determines the cost of the memory: the faster the memory, the higher the price.

Computer memory can be said to be organized in a hierarchical way, where memory with the fastest access speeds and highest costs lies at the top, whereas memory with the lowest speeds and hence the lowest costs lies at the bottom. Based on these criteria, memory is of two types – primary and secondary. Here we will look at primary memory in detail.

The main features of primary memory, which distinguish it from secondary memory are −

  • It is accessed directly by the processor
  • It is the fastest memory available
  • Each word is stored as well as retrieved individually
  • It is volatile, i.e. its contents are lost once power is switched off

As primary memory is expensive, technologies are developed to optimize its use. These are broad types of primary memory available.

Primary Memory

RAM

RAM stands for Random Access Memory. The processor accesses all memory addresses directly, irrespective of word length, making storage and retrieval fast. RAM is the fastest memory available and hence the most expensive. These two factors imply that RAM is available in relatively small quantities – a few gigabytes in a typical personal computer. RAM is volatile and may be either of these two types −

DRAM (Dynamic RAM)

Each memory cell in a DRAM is made of one transistor and one capacitor, which together store one bit of data. However, the cell starts losing its charge, and hence the data stored in it, in less than a thousandth of a second, so it needs to be refreshed thousands of times per second, which takes up time. Due to the small size of each cell, one DRAM chip can have a large number of cells. The primary memory of most personal computers is made of DRAM.

SRAM (Static RAM)

Each cell in SRAM is made of a flip flop that stores one bit. It retains its bit till the power supply is on and doesn’t need to be refreshed like DRAM. It also has shorter read-write cycles as compared to DRAM. SRAM is used in specialized applications.

ROM

ROM stands for Read Only Memory. As the name suggests, ROM can only be read by the processor. New data cannot be written into ROM. Data to be stored into ROM is written during the manufacturing phase itself. They contain data that does not need to be altered, like booting sequence of a computer or algorithmic tables for mathematical applications. ROM is slower and hence cheaper than RAM. It retains its data even when power is switched off, i.e. it is non-volatile. ROM cannot be altered the way RAM can be but technologies are available to program these types of ROMs −

PROM (Programmable ROM)

PROM can be programmed using a special hardware device called PROM programmer or PROM burner.

EPROM (Erasable Programmable ROM)

EPROM can be erased and then programmed using special electrical signals or UV rays. EPROMs that can be erased using UV rays are called UVEPROM and those that can be erased using electrical signals are called EEPROM. However, handling electric signals is easier and safer than UV rays.

Cache Memory

A small piece of high-speed volatile memory available to the processor for fast processing is called cache memory. Cache may be a reserved portion of main memory, another chip on the CPU or an independent high-speed storage device. Cache memory is made of high-speed SRAM. The process of keeping some data and instructions in cache memory for faster access is called caching. Caching is done when a set of data or instructions is accessed again and again.

Whenever the processor needs any piece of data or instructions, it checks the cache first. If it is unavailable there, then the main memory and finally secondary memory is accessed. As the cache has a very high speed, the time spent accessing it every time is negligible compared to the time saved if the data is indeed in the cache. Finding the required data or instruction in the cache is called a cache hit.
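
The hit/miss behaviour described above can be sketched with a toy direct-mapped cache. The C example below is purely illustrative: the line count, the tag/index scheme and the simulated main-memory read are assumptions made for this demonstration, not a description of real hardware.

#include <stdio.h>
#include <stdbool.h>

#define CACHE_LINES 8   /* toy size, chosen for illustration */

/* One cache line: which memory address it holds and whether it is valid. */
struct cache_line {
    bool valid;
    unsigned int tag;
    int data;
};

static struct cache_line cache[CACHE_LINES];

/* Simulated "main memory" read: in reality this is the slow path. */
static int read_main_memory(unsigned int address) {
    return (int)(address * 10);   /* dummy content */
}

/* Look up an address: on a hit return the cached value, on a miss
   fetch from main memory and fill the line. */
static int cache_read(unsigned int address) {
    unsigned int index = address % CACHE_LINES;
    unsigned int tag = address / CACHE_LINES;

    if (cache[index].valid && cache[index].tag == tag) {
        printf("address %u: cache hit\n", address);
        return cache[index].data;
    }
    printf("address %u: cache miss, going to main memory\n", address);
    cache[index].valid = true;
    cache[index].tag = tag;
    cache[index].data = read_main_memory(address);
    return cache[index].data;
}

int main(void) {
    cache_read(5);   /* miss */
    cache_read(5);   /* hit  */
    cache_read(13);  /* miss: maps to the same line as 5 and replaces it */
    cache_read(5);   /* miss again */
    return 0;
}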

Basics of Computers - Secondary Memory

You know that processor memory, also known as primary memory, is expensive as well as limited. The faster primary memory is also volatile. If we need to store a large amount of data or programs permanently, we need cheaper and permanent memory. Such memory is called secondary memory. Here we will discuss secondary memory devices that can be used to store large amounts of data, audio, video and multimedia files.

Characteristics of Secondary Memory

These are some characteristics of secondary memory, which distinguish it from primary memory −

  • It is non-volatile, i.e. it retains data when power is switched off
  • It has large capacities, to the tune of terabytes
  • It is cheaper as compared to primary memory

Depending on whether the secondary memory device is a fixed part of the computer or not, there are two types of secondary memory – fixed and removable.

Secondary Memory

Let us look at some of the secondary memory devices available.

Hard Disk Drive

A hard disk drive is made up of a series of circular disks called platters, arranged one over the other a small distance apart around a spindle. The platters are made of non-magnetic material like aluminium alloy and coated with a 10–20 nm layer of magnetic material.

Hard Disk Drive

Common platter diameters are 3.5 inches for desktops and servers and 2.5 inches for laptops, and the platters rotate at speeds varying from 4200 rpm (rotations per minute) for personal computers to 15000 rpm for servers. Data is stored by magnetizing or demagnetizing the magnetic coating. A read/write head mounted on a moving arm is used to read data from and write data to the disks. A typical modern HDD has a capacity in terabytes (TB).

CD Drive

CD stands for Compact Disk. CDs are circular disks that use optical rays, usually lasers, to read and write data. They are very cheap, as you can get 700 MB of storage space for less than a dollar. CDs are inserted into CD drives built into the computer case. They are portable, as you can eject the tray, remove the CD and carry it with you. There are three types of CDs −

  • CD-ROM (Compact Disk – Read Only Memory) − The data on these CDs are recorded by the manufacturer. Proprietary Software, audio or video are released on CD-ROMs.

  • CD-R (Compact Disk – Recordable) − Data can be written by the user once on the CD-R. It cannot be deleted or modified later.

  • CD-RW (Compact Disk – Rewritable) − Data can be written and deleted on these optical disks again and again.

DVD Drive

DVD stands for Digital Versatile Disc (originally Digital Video Disc). DVDs are optical devices that can store several times the data held by CDs – 4.7 GB on a standard single-layer disc. They are usually used to store rich multimedia files that need high storage capacity. DVDs also come in three varieties – read only, recordable and rewritable.

DVD Drive

Pen Drive

Pen drive is a portable memory device that uses solid state memory rather than magnetic fields or lasers to record data. It uses a technology similar to RAM, except that it is nonvolatile. It is also called USB drive, key drive or flash memory.

Pen Drive

Blu Ray Disk

Blu-ray Disc (BD) is an optical storage medium used to store high definition (HD) video and other multimedia files. BD uses a shorter-wavelength laser than CD/DVD. This enables the laser to focus more tightly on the disk and hence pack in more data. BDs can store up to 128 GB of data.

Basics of Computers - Input/Output Ports

A connection point that acts as an interface between the computer and external devices like a mouse, printer, modem, etc. is called a port. Ports are of two types −

  • Internal port − It connects the motherboard to internal devices like hard disk drive, CD drive, internal modem, etc.

  • External port − It connects the motherboard to external devices like modem, mouse, printer, flash drives, etc.

Input Output Ports

Let us look at some of the most commonly used ports.

Serial Port

Serial ports transmit data sequentially, one bit at a time, so they need only one wire to transmit 8 bits. However, this also makes them slower. Serial ports are usually 9-pin or 25-pin male connectors. They are also known as COM (communication) ports or RS-232C ports.
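
To illustrate the idea of sending one bit at a time over a single wire, the C sketch below "transmits" a byte by printing its bits, least-significant bit first. It is a conceptual illustration written for this tutorial and does not talk to a real serial port.

#include <stdio.h>

/* Conceptual illustration only: "send" a byte one bit at a time,
   least-significant bit first, the way a serial line would. */
static void send_serial(unsigned char byte) {
    for (int bit = 0; bit < 8; bit++) {
        int level = (byte >> bit) & 1;   /* next bit on the single wire */
        printf("%d", level);
    }
    printf("  (byte 0x%02X sent serially)\n", byte);
}

int main(void) {
    send_serial('A');   /* 0x41 */
    return 0;
}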

Serial Ports

Parallel Port

Parallel ports can send or receive 8 bits or 1 byte at a time. Parallel ports come in the form of 25-pin female connectors and are used to connect printers, scanners, external hard disk drives, etc.

Parallel Ports

USB Port

USB stands for Universal Serial Bus. It is the industry standard for short distance digital data connection. USB port is a standardized port to connect a variety of devices like printer, camera, keyboard, speaker, etc.

USB Port

PS-2 Port

PS/2 stands for Personal System/2. It is a female 6-pin port that connects to a male mini-DIN cable. PS/2 was introduced by IBM to connect the mouse and keyboard to personal computers. This port is now mostly obsolete, though some IBM-compatible systems may still have it.

Infrared Port

An infrared port enables wireless exchange of data within a radius of about 10 m. Two devices that have infrared ports are placed facing each other so that beams of infrared light can be used to share data.

Bluetooth Port

Bluetooth is a telecommunication specification that facilitates wireless connection between phones, computers and other digital devices over short range wireless connection. Bluetooth port enables synchronization between Bluetooth-enabled devices. There are two types of Bluetooth ports −

  • Incoming − It is used to receive connection from Bluetooth devices.

  • Outgoing − It is used to request connection to other Bluetooth devices.

FireWire Port

FireWire is Apple’s interface standard for enabling high-speed communication using a serial bus. It is also called IEEE 1394 and is used mostly for audio and video devices like digital camcorders.

 

INTRODUCTION TO ENVIRONMENT PROTECTION ONLINE TUTORIAL FOR HIGH SCHOOL AND UNIVERSITY LEVEL

 

Meaning

Environmental protection refers to any measure that is taken to conserve, maintain or preserve the state of the environment. Protection of the environment can be done through reducing pollutants or anything that leads to its degradation.

Conservation of the environment aims at keeping it safe and healthy. It aims at the reduction of overusing the natural resources. It is the taking care of all the components that make up the environment.

 

 

Preserving is also used hand in hand with the term conserving. Environmental preservation refers to practices that do not alter the environment. It aims at keeping the environment unchanged so as to leave it intact.

Protection of the environment can also mean that the environmental practices that the human race indulge in are sustainable to help avoid damaging or harming the ecosystem.

The animals are also part of the environment and when we talk about protection, animals are a key factor. For animals, it is more about conservation. It has to do with protecting the endangered species from extinction by discouraging activities such as poaching.

When the environment is protected it serves to benefit all those who rely on it. It serves in the maintenance of life for the plants, humans and animals.

Importance of environmental protection.

To protect/save our lives.

The environment supports the life of each and every living thing on earth. We rely on the environment for life. When it is protected, we are assured of better health, food, quality air and so much more. As the late Professor Wangari Maathai said, “if we destroy the environment, the environment will destroy us.” This is so true because it is the environment that sustains our life.

The environment has suffered due to the scientific inventions.

A lot has been discovered over the years. Many of these inventions tend to be harmful to the environment, even though they are a way for the human race to try to make life better. Factories have been built in so many places around the world, and the emission of harmful gases into the air is on the increase. The drilling of oil at sea is another case. Trees are being cut down to create more space. With all this going on, the environment remains at our mercy for protection.

Discharge of carbon gases.

The first thing everyone with a good amount of money thinks about is how to get a car. The purchase of cars has grown over the years. The worst part is that there are not many cars that are environmentally friendly. Most of them use fuel which, when burnt, releases carbon into the air. Factories also play a role in this. Carbon gases are not in any way friendly to the environment, and this is why we need to protect it.

Use of low grade plastics.

The chemicals that go into making plastics are highly toxic and pose serious threats to the environment. Burning plastics during their production and even after use releases toxic fumes into the air. The toxins can also leak into the soil and ground water, causing contamination. This makes it difficult to grow plants and can even disrupt hormones in many living things.

Biodiversity is important.

For the environment to be a better place to live, biodiversity has to be a part of it. In science it is said that during the day plants use carbon dioxide, while humans breathe in oxygen and breathe out carbon dioxide. This is a form of exchange. Plants help to reduce carbon dioxide in the air, which in turn benefits humans. Each living thing has a role to play in the environment. It makes the world a better place to live in.

It is our moral obligation.

We owe our existence to the environment. It is our role to ensure that the environment supports us and other living species in a comfortable way. The only way we can pay it back is by protecting it in all ways possible.

Environmental hazards are dangerous.

When we look at our water bodies, they have become dumping grounds for dangerous chemicals. Most factories throw their waste into the lakes and oceans. These chemicals end up in the food web such as mercury in fish. These foods end up on our plates and the end result could be serious diseases. The air we breathe in and what our skin comes into contact with is crucial. When we have harmful gases in the air it is a threat to life.

How to protect the environment?

The following protective and control measures are aimed at environmental protection.

Reforestation.

 

 

Trees are very important. They play a great role in air purification, they are water catchment areas and a home to many other living species. The more trees we have the purer the air and the less the chances of having water problems. Animals also have their own habitat in the forests.

Use green technologies.

Factories should try to go green. Use more environmentally friendly gases that are not harmful. Wind energy is one way to go about it. Solar energy is also an applicable method. Use of renewable energy will reduce the harmful chemicals.

Use less chemicals in factories.

When the chemicals are less, this means that waste is also minimal. Not only the factories but also those who do farming. Most of the pesticides and fertilizers used in farms end up in water bodies. It is important to use a reasonable amount that will not be harmful.

Share your car or use public transport.

Since cars are a major air pollutant, trying to minimize the number of cars on the road is important. People can choose to use more of public transport. Families can share cars rather than each member of the family having a car or use bicycles where possible.

Create awareness.

It is important to educate people on the importance of environmental protection. The little things people do will go a long way in environmental protection. Encourage them on the importance of reusing and recycling. Encourage the use of energy savers as well as collecting rain water. It is also important to educate on animal conservation as well as taking part in tree planting programs. There are so many organizations all over the world that are planting trees with the aim of saving the world.

Be sustainable.

Try as much as possible to reduce or eliminate waste and use the environment in an eco-friendly way. Sustainability means reducing the use of natural resources so as to reduce the chances of depleting them. Using fluorescent tubes, rechargeable batteries, renewable energy and reusable shopping bags are some of the ways to be sustainable.

Plant trees along sea-beach.

Sea beaches are low-lying lands that are bare. Planting trees in these areas will play a great role in protecting the environment. The trees can serve as air purifiers and reduce the chances of harmful waste reaching the seas and oceans. They will also help to stabilize the sand dunes.

Conclusion.

The environment is very important for the existence of life. It supports living things and protecting it should be our first priority. Without a conducive environment, we run the risk of extinction. It is good to understand that it is the environment that supports us and not vice versa.

 

COMPUTER HARDWARE ONLINE TUTORIAL FOR HIGH SCHOOL AND UNIVERSITY LEVEL

 
 Central Processing Unit (CPU)

Intel CPU

The CPU of your computer is very much like your brain: it is the part of the computer that gives out all the basic instructions to every other component. The CPU is one of the main components that will affect the performance of your computer; generally, a more powerful CPU will let a computer perform tasks faster and handle more intensive tasks as well. The two main brands of desktop CPU manufacturers are AMD and Intel, both of which have certain advantages and disadvantages in their hardware. You’ll want to do your own research on what CPU works for you; depending on what your other components are and your budget, each brand will have something it excels in. Overall, you should come away from this understanding that your CPU is the part of your computer that tells all your other components what to do, and it largely determines how fast your computer carries out its tasks.

Motherboard
Motherboard

The motherboard of your PC is your inner body: it connects all the different parts of your PC together. Your motherboard is another critical component; it may not affect your performance directly, but it will determine which parts you can use. Every motherboard lists in its specifications what it is compatible with. This isn’t too big of an issue if you are buying a laptop or a computer that is already pre-built; however, when building a computer this is extremely important and determines what parts you can use. Your motherboard is also the component that decides what inputs and outputs your computer has. Examples of these inputs and outputs are your audio outputs, video outputs, USB ports, Ethernet ports, FireWire ports, and your mouse and keyboard connectors. Most motherboards now come with very basic built-in video; it won’t be useful for high-performance gaming and will make rendering slower when video-editing, but it will allow you to play back video files and see your computer on a monitor, of course.

Random Access Memory (RAM)
RAM

RAM isn’t easy to compare to a part of your body, so it is better to explain it through example. Whenever you open a new program on your computer and it takes a minute to load, the computer is accessing your RAM – the temporary memory/information in the computer. When you close your program, that data goes away and stops taking up part of your RAM. This is the reason RAM is necessary for a computer: any temporary data that you access will use your RAM. Most programs such as a web browser or word processor will not use a large amount of RAM; however, programs like high-end games, photo editors, and video editors can use a large amount of RAM, especially if you’re running multiple applications at once. RAM comes in the form of sticks that you insert into your motherboard. RAM can be upgraded at any time on a desktop, assuming it is compatible with your motherboard; however, be careful and check how many sticks of RAM your motherboard supports and in what configuration. Overall, RAM will affect how quickly programs run, how quickly they boot up, and how many can run at a time, making it extremely important to having a faster and more efficient computer.

Hard Disk Drive/Solid State Drive (HDD/SSD)
Hard Drive

The next two items share the same function but are built differently. A Hard Drive uses a spinning disk and magnetic heads to write data onto the disk, which will permanently store information, assuming the disk itself does not get damaged for other reasons. Hard Drives are older compared to Solid State Drives and are significantly cheaper than SSDs. Solid State Drives work off of flash memory; unlike a hard drive they have no moving parts and everything works electronically. Examples of devices that work on flash memory that you are possibly familiar with are your USB storage devices/“flash drives”, video game memory cards, or the SD cards that most cameras use. SSDs read data much faster but, due to the technology being newer, are more expensive. If you’re building a computer that will not require a lot of storage, it would be preferable to buy an SSD; however, Hard Drives are much less expensive if you plan on storing more than 250 gigabytes of information.

Now that we’ve talked about the differences between these two items, let me bring the focus back to their main purpose. Both of these devices are used to store information – your photos, word documents, videos, etc. that you save to your computer, all the files that appear every time you turn on the computer. Hard drives come in storage sizes measured in gigabytes (1 gigabyte = 1024 megabytes) and terabytes. As an example of how large a file can be, a 1080p high-quality movie that is around 2-3 hours long can be 3-6 gigabytes of data. Larger storage devices would be needed for someone who works with a lot of video, gaming, or possibly sound editing. Basic word documents, presentations, and images are immensely smaller than video.

If you are using a desktop you can always add more hard drives or Solid State Drives to your computer; they connect to the motherboard on the inside of the computer. Your operating system, which I will get into in more detail later, is installed onto your hard drive and is necessary to run a computer. If you’re using a laptop, your hard drive will be installed in the laptop from the start and there will usually be no way to add new internal hard drives, as there’s no space inside the laptop. Should you want to add more storage to your laptop, you can buy external storage devices that connect via USB or FireWire and will store your data.

Power Supply Unit (PSU)
PSU

With all the electronics we’ve discussed, obviously they need power in order to function, and this is where your power supply comes into play. Your power supply is exactly what it sounds like: it is the part of the computer that supplies power to all your components, converting the energy from your wall socket into energy for the computer to use. Something to keep in mind about a power supply is that more wattage is NOT always better; depending on how powerful your computer components are, you may not need a large wattage. There are many calculators online, such as this one
http://www.coolermaster.outervision.com/
that will give you an idea of how many watts you need for your PC. It is best to research a power supply yourself if you are building a PC, to see what you need for your parts. A pre-built PC or laptop will of course come with a power supply installed that will supply power to the computer, so you won’t have to worry about choosing one if you already have your PC built; however, it is useful to know in case you ever upgrade to a more energy-efficient supply. The power supply is typically located at the top or bottom of the computer case, exposed at the back so you can connect it to a wall socket. Your power supply will not affect your computer’s performance, but it is still necessary for the computer to run in the first place.

Graphics Processing Unit (GPU)
GPU

Your Graphics Processing Unit or GPU is a key component of your computer: it is the component that outputs all your visuals and lets you play back video. Most motherboards today come with basic video output built in that will play back video and allow for basic video editing. For casual computer use, the pre-installed graphics on the motherboard are perfectly fine. For a PC dedicated to video editing, however, a graphics card may be very helpful as it will allow you to render video faster, and for gaming it is necessary to run high-end games. If you choose to upgrade your computer by buying a graphics card, there is a slot on the motherboard for you to attach it to, so you can upgrade your computer’s graphics at any time. A laptop, however, as we’ve examined before, usually cannot be upgraded, and you will have to look for a laptop with a more advanced video card if you’re looking to use it for gaming or faster video editing.

Computer Case/Tower
Tower

This may seem like a no-brainer to some, but your computer case or “tower” is extremely important for your PC. With all the fragile electronics in your PC, you need something to protect them, and this is where your tower comes into play. Your tower will keep all your components protected and in place, and provide proper ventilation. The enemy of electronics is heat: electronics get very hot very quickly, and without proper ventilation you would easily overheat your computer and it would become useless. Your tower will have open ventilation shafts and fans to keep your electronics nice and cool. A pre-built computer will always come with a tower; however, you can choose to upgrade and buy a new tower if you are unsatisfied with your old one. A more expensive tower may come with additional USB ports in the front, better ventilation, larger (and quieter) fans, and more dust protection. Laptops come in manufactured cases and, as we’ve said before, are difficult to upgrade; however, consider the size and weight of your laptop case as well when you purchase it. Your laptop, unlike a desktop PC, will have the monitor as part of the case, and screen size may be important if you’re in a field that requires video editing, or even for casual use.

Computer Monitor
Monitor

Your computer monitor applies to a desktop PC; as said before, for a laptop your screen/monitor is attached to your case. A computer monitor is simply the screen that will be giving you your video output from the computer. A monitor’s screen size and features may be important to you, as some monitors will have more video inputs such as HDMI or VGA, so make sure your monitor has the appropriate video connections for your graphics card. There is not much more detail to go into for a monitor; some monitors come with built-in sound for your computer, but you may prefer getting speakers instead. Most laptops also have a video output if you would like to put your laptop on a larger monitor; of course this is not exactly necessary for most laptop users, as a monitor is heavy, runs off power from a wall socket, and you won’t be carrying it around.

Speakers

I’ll take this small section to discuss speakers; they are rather self-explanatory. Some motherboards and cases will have sound, so you will not need speakers; however, most people will prefer the sound quality of dedicated speakers. Speakers connect to the back of your computer and will of course play audio. Laptops will usually have outputs for microphones, speakers, and headphones, but otherwise come with their own internal speakers. There are many speakers to choose from, too many to talk about here, so do your own research when buying speakers and see which features appeal to you the most.

Optical Drive/Blu-ray Drive
optical-drives

This is the final component needed for your computer: an optical drive, i.e. a CD/DVD drive. Blu-ray drives also read CDs/DVDs as well as Blu-ray disks, but may not WRITE CD or DVD formats. Either one will be fine, however, as what you need is something to read the disk to install your operating system. An operating system delves more into software, but it is simply the software of your computer that manages other software and your hardware devices. Examples of different operating systems are Mac OS X, Windows XP, Windows 7, and Linux. When buying a pre-built PC, it will typically come with an operating system already installed and an optical drive; when building a PC, however, you will need to purchase these yourself. It is entirely possible to install an operating system off of a flash drive as well, but it is typically handy to have an optical drive in case you install any other data or programs via CD. You can always add an optical drive to your desktop computer as well, should you find the need for one later on.

 OPERATING SYSTEMS ONLINE TUTORIAL FOR HIGH SCHOOL AND UNIVERSITY LEVEL

 

 An Operating System (OS) is an interface between a computer user and the computer hardware. An operating system is software that performs all the basic tasks like file management, memory management, process management, handling input and output, and controlling peripheral devices such as disk drives and printers.

Some popular Operating Systems include Linux Operating System, Windows Operating System, VMS, OS/400, AIX, z/OS, etc.

Definition

An operating system is a program that acts as an interface between the user and the computer hardware and controls the execution of all kinds of programs.

Conceptual view of an Operating System

Following are some of important functions of an operating System.

  • Memory Management
  • Processor Management
  • Device Management
  • File Management
  • Security
  • Control over system performance
  • Job accounting
  • Error detecting aids
  • Coordination between other software and users

Memory Management

Memory management refers to management of Primary Memory or Main Memory. Main memory is a large array of words or bytes where each word or byte has its own address.

Main memory provides fast storage that can be accessed directly by the CPU. For a program to be executed, it must be in the main memory. An Operating System does the following activities for memory management (a small C sketch of this bookkeeping follows the list) −

  • Keeps track of primary memory, i.e., which parts of it are in use and by whom, and which parts are not in use.

  • In multiprogramming, the OS decides which process will get memory when and how much.

  • Allocates the memory when a process requests it to do so.

  • De-allocates the memory when a process no longer needs it or has been terminated.
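
A minimal sketch of this bookkeeping, assuming a toy machine whose memory is divided into fixed-size frames, is shown below in C. The frame count and the owner-id scheme are assumptions made purely for illustration, not how any particular OS implements memory management.

#include <stdio.h>

#define NUM_FRAMES 16   /* toy number of memory frames, for illustration */

/* 0 means the frame is free; otherwise it stores the id of the owning process. */
static int frame_owner[NUM_FRAMES];

/* Allocate one free frame to a process, or return -1 if memory is full. */
static int allocate_frame(int process_id) {
    for (int i = 0; i < NUM_FRAMES; i++) {
        if (frame_owner[i] == 0) {
            frame_owner[i] = process_id;
            return i;
        }
    }
    return -1;
}

/* De-allocate every frame held by a terminated process. */
static void free_frames(int process_id) {
    for (int i = 0; i < NUM_FRAMES; i++)
        if (frame_owner[i] == process_id)
            frame_owner[i] = 0;
}

int main(void) {
    int f1 = allocate_frame(1);
    int f2 = allocate_frame(2);
    if (f1 < 0 || f2 < 0) {
        printf("out of memory\n");
        return 1;
    }
    printf("process 1 got frame %d, process 2 got frame %d\n", f1, f2);
    free_frames(1);
    printf("frame %d is now %s\n", f1, frame_owner[f1] == 0 ? "free" : "in use");
    return 0;
}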

Processor Management

In a multiprogramming environment, the OS decides which process gets the processor, when, and for how much time. This function is called process scheduling. An Operating System does the following activities for processor management −

  • Keeps track of the processor and the status of each process. The program responsible for this task is known as the traffic controller.

  • Allocates the processor (CPU) to a process.

  • De-allocates processor when a process is no longer required.

Device Management

An Operating System manages device communication via their respective drivers. It does the following activities for device management −

  • Keeps track of all devices. The program responsible for this task is known as the I/O controller.

  • Decides which process gets the device, when, and for how much time.

  • Allocates the device in an efficient way.

  • De-allocates devices.

File Management

A file system is normally organized into directories for easy navigation and usage. These directories may contain files and other directories.

An Operating System does the following activities for file management −

  • Keeps track of information, location, uses, status etc. The collective facilities are often known as file system.

  • Decides who gets the resources.

  • Allocates the resources.

  • De-allocates the resources.

Other Important Activities

Following are some of the important activities that an Operating System performs −

  • Security − By means of password and similar other techniques, it prevents unauthorized access to programs and data.

  • Control over system performance − Recording delays between request for a service and response from the system.

  • Job accounting − Keeping track of time and resources used by various jobs and users.

  • Error detecting aids − Production of dumps, traces, error messages, and other debugging and error detecting aids.

  • Coordination between other software and users − Coordination and assignment of compilers, interpreters, assemblers and other software to the various users of the computer systems.

 
TYPES OF OPERATING SYSTEMS 

 

Operating systems are there from the very first computer generation and they keep evolving with time. In this chapter, we will discuss some of the important types of operating systems which are most commonly used.

Batch operating system

The users of a batch operating system do not interact with the computer directly. Each user prepares his job on an off-line device like punch cards and submits it to the computer operator. To speed up processing, jobs with similar needs are batched together and run as a group. The programmers leave their programs with the operator and the operator then sorts the programs with similar requirements into batches.

The problems with Batch Systems are as follows −

  • Lack of interaction between the user and the job.
  • CPU is often idle, because the speed of the mechanical I/O devices is slower than the CPU.
  • Difficult to provide the desired priority.

Time-sharing operating systems

Time-sharing is a technique which enables many people, located at various terminals, to use a particular computer system at the same time. Time-sharing or multitasking is a logical extension of multiprogramming. Processor's time which is shared among multiple users simultaneously is termed as time-sharing.

The main difference between Multiprogrammed Batch Systems and Time-Sharing Systems is that in case of Multiprogrammed batch systems, the objective is to maximize processor use, whereas in Time-Sharing Systems, the objective is to minimize response time.

Multiple jobs are executed by the CPU by switching between them, but the switches occur so frequently that the user can receive an immediate response. For example, in transaction processing, the processor executes each user program in a short burst or quantum of computation. That is, if n users are present, then each user gets a time quantum in turn. When the user submits a command, the response time is a few seconds at most.
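
The quantum idea can be illustrated with a toy round-robin simulation, shown below in C. The number of users, the quantum length and the amount of work per job are arbitrary values chosen for this example; a real scheduler is of course far more involved.

#include <stdio.h>

#define NUM_USERS 3
#define QUANTUM   2   /* time units given to each user per turn (illustrative) */

int main(void) {
    int remaining[NUM_USERS] = { 5, 3, 4 };   /* work left for each user's job */
    int done = 0;
    int clock = 0;

    while (done < NUM_USERS) {
        for (int u = 0; u < NUM_USERS; u++) {
            if (remaining[u] == 0)
                continue;                     /* this user's job already finished */
            int slice = remaining[u] < QUANTUM ? remaining[u] : QUANTUM;
            remaining[u] -= slice;
            clock += slice;
            printf("t=%2d: user %d ran for %d unit(s), %d left\n",
                   clock, u, slice, remaining[u]);
            if (remaining[u] == 0)
                done++;
        }
    }
    return 0;
}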

The operating system uses CPU scheduling and multiprogramming to provide each user with a small portion of time. Computer systems that were designed primarily as batch systems have been modified to time-sharing systems.

Advantages of Timesharing operating systems are as follows −

  • Provides the advantage of quick response.
  • Avoids duplication of software.
  • Reduces CPU idle time.

Disadvantages of Time-sharing operating systems are as follows −

  • Problem of reliability.
  • Question of security and integrity of user programs and data.
  • Problem of data communication.

Distributed operating System

Distributed systems use multiple central processors to serve multiple real-time applications and multiple users. Data processing jobs are distributed among the processors accordingly.

The processors communicate with one another through various communication lines (such as high-speed buses or telephone lines). These are referred to as loosely coupled systems or distributed systems. Processors in a distributed system may vary in size and function. These processors are referred to as sites, nodes, computers, and so on.

The advantages of distributed systems are as follows −

  • With resource sharing facility, a user at one site may be able to use the resources available at another.
  • Speeds up the exchange of data with one another via electronic mail.
  • If one site fails in a distributed system, the remaining sites can potentially continue operating.
  • Better service to the customers.
  • Reduction of the load on the host computer.
  • Reduction of delays in data processing.

Network operating System

A Network Operating System runs on a server and provides the server the capability to manage data, users, groups, security, applications, and other networking functions. The primary purpose of the network operating system is to allow shared file and printer access among multiple computers in a network, typically a local area network (LAN), a private network or to other networks.

Examples of network operating systems include Microsoft Windows Server 2003, Microsoft Windows Server 2008, UNIX, Linux, Mac OS X, Novell NetWare, and BSD.

The advantages of network operating systems are as follows −

  • Centralized servers are highly stable.
  • Security is server managed.
  • Upgrades to new technologies and hardware can be easily integrated into the system.
  • Remote access to servers is possible from different locations and types of systems.

The disadvantages of network operating systems are as follows −

  • High cost of buying and running a server.
  • Dependency on a central location for most operations.
  • Regular maintenance and updates are required.

Real Time operating System

A real-time system is defined as a data processing system in which the time interval required to process and respond to inputs is so small that it controls the environment. The time taken by the system to respond to an input and display the required updated information is termed the response time. In this method, the response time is very small compared to online processing.

Real-time systems are used when there are rigid time requirements on the operation of a processor or the flow of data and real-time systems can be used as a control device in a dedicated application. A real-time operating system must have well-defined, fixed time constraints, otherwise the system will fail. For example, Scientific experiments, medical imaging systems, industrial control systems, weapon systems, robots, air traffic control systems, etc.

There are two types of real-time operating systems.

Hard real-time systems

Hard real-time systems guarantee that critical tasks complete on time. In hard real-time systems, secondary storage is limited or missing and the data is stored in ROM. In these systems, virtual memory is almost never found.

Soft real-time systems

Soft real-time systems are less restrictive. A critical real-time task gets priority over other tasks and retains the priority until it completes. Soft real-time systems have more limited utility than hard real-time systems. Examples include multimedia, virtual reality, and advanced scientific projects like undersea exploration and planetary rovers.

 

OPERATING SYSTEM 

SERVICES  

 

 An Operating System provides services to both the users and to the programs.

  • It provides programs an environment to execute.
  • It provides users the services to execute the programs in a convenient manner.

Following are a few common services provided by an operating system −

  • Program execution
  • I/O operations
  • File System manipulation
  • Communication
  • Error Detection
  • Resource Allocation
  • Protection

Program execution

Operating systems handle many kinds of activities from user programs to system programs like printer spooler, name servers, file server, etc. Each of these activities is encapsulated as a process.

A process includes the complete execution context (code to execute, data to manipulate, registers, OS resources in use). Following are the major activities of an operating system with respect to program management (a small POSIX sketch follows the list) −

  • Loads a program into memory.
  • Executes the program.
  • Handles program's execution.
  • Provides a mechanism for process synchronization.
  • Provides a mechanism for process communication.
  • Provides a mechanism for deadlock handling.
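
On POSIX systems (Linux, macOS, etc.) a program can ask the operating system to load and execute another program using fork(), execlp() and waitpid(). The sketch below shows that sequence; the choice of the ls program is arbitrary, and the example illustrates the idea rather than describing how every operating system implements program execution.

#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* POSIX-only sketch: the OS loads and runs another program on our behalf.
   fork() creates a new process; execlp() replaces it with the program;
   waitpid() lets the parent wait for it to finish. */
int main(void) {
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: ask the OS to load and execute the "ls" program. */
        execlp("ls", "ls", "-l", (char *)NULL);
        perror("execlp failed");   /* only reached if exec fails */
        return 1;
    } else if (pid > 0) {
        int status;
        waitpid(pid, &status, 0);  /* parent waits for the child to finish */
        if (WIFEXITED(status))
            printf("child finished with status %d\n", WEXITSTATUS(status));
    } else {
        perror("fork failed");
        return 1;
    }
    return 0;
}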

I/O Operation

An I/O subsystem comprises I/O devices and their corresponding driver software. Drivers hide the peculiarities of specific hardware devices from the users.

An Operating System manages the communication between user and device drivers.

  • I/O operation means read or write operation with any file or any specific I/O device.
  • Operating system provides the access to the required I/O device when required.

File system manipulation

A file represents a collection of related information. Computers can store files on the disk (secondary storage), for long-term storage purpose. Examples of storage media include magnetic tape, magnetic disk and optical disk drives like CD, DVD. Each of these media has its own properties like speed, capacity, data transfer rate and data access methods.

A file system is normally organized into directories for easy navigation and usage. These directories may contain files and other directories. Following are the major activities of an operating system with respect to file management −

  • Program needs to read a file or write a file.
  • The operating system gives the program permission to operate on the file.
  • Permission varies from read-only, read-write, denied and so on.
  • Operating System provides an interface to the user to create/delete files.
  • Operating System provides an interface to the user to create/delete directories.
  • Operating System provides an interface to create the backup of file system.

Communication

In case of distributed systems which are a collection of processors that do not share memory, peripheral devices, or a clock, the operating system manages communications between all the processes. Multiple processes communicate with one another through communication lines in the network.

The OS handles routing and connection strategies, and the problems of contention and security. Following are the major activities of an operating system with respect to communication −

  • Two processes often require data to be transferred between them
  • Both the processes can be on one computer or on different computers, but are connected through a computer network.
  • Communication may be implemented by two methods, either by Shared Memory or by Message Passing.

Error handling

Errors can occur anytime and anywhere. An error may occur in CPU, in I/O devices or in the memory hardware. Following are the major activities of an operating system with respect to error handling −

  • The OS constantly checks for possible errors.
  • The OS takes an appropriate action to ensure correct and consistent computing.

Resource Management

In a multi-user or multi-tasking environment, resources such as main memory, CPU cycles and file storage are to be allocated to each user or job. Following are the major activities of an operating system with respect to resource management −

  • The OS manages all kinds of resources using schedulers.
  • CPU scheduling algorithms are used for better utilization of CPU.

Protection

Considering a computer system having multiple users and concurrent execution of multiple processes, the various processes must be protected from each other's activities.

Protection refers to a mechanism or a way to control the access of programs, processes, or users to the resources defined by a computer system. Following are the major activities of an operating system with respect to protection −

  • The OS ensures that all access to system resources is controlled.
  • The OS ensures that external I/O devices are protected from invalid access attempts.
  • The OS provides authentication features for each user by means of passwords.
 
 

 OPERATING SYSTEM PROPERTIES

 

Batch processing

Batch processing is a technique in which an Operating System collects the programs and data together in a batch before processing starts. An operating system does the following activities related to batch processing −

  • The OS defines a job which has a predefined sequence of commands, programs and data as a single unit.

  • The OS keeps a number of jobs in memory and executes them without any manual intervention.

  • Jobs are processed in the order of submission, i.e., first come first served fashion.

  • When a job completes its execution, its memory is released and the output for the job gets copied into an output spool for later printing or processing.

Batch Processing

Advantages

  • Batch processing shifts much of the work of the operator to the computer.

  • Increased performance, as a new job gets started as soon as the previous job finishes, without any manual intervention.

Disadvantages

  • Difficult to debug program.
  • A job could enter an infinite loop.
  • Due to lack of protection scheme, one batch job can affect pending jobs.

Multitasking

Multitasking is when multiple jobs are executed by the CPU simultaneously by switching between them. Switches occur so frequently that the users may interact with each program while it is running. An OS does the following activities related to multitasking −

  • The user gives instructions to the operating system or to a program directly, and receives an immediate response.

  • The OS handles multitasking in the way that it can perform multiple operations / execute multiple programs at a time.

  • Multitasking Operating Systems are also known as Time-sharing systems.

  • These Operating Systems were developed to provide interactive use of a computer system at a reasonable cost.

  • A time-shared operating system uses the concept of CPU scheduling and multiprogramming to provide each user with a small portion of a time-shared CPU.

  • Each user has at least one separate program in memory.

Multitasking
  • A program that is loaded into memory and is executing is commonly referred to as a process.

  • When a process executes, it typically executes for only a very short time before it either finishes or needs to perform I/O.

  • Since interactive I/O typically runs at slower speeds, it may take a long time to complete. During this time, a CPU can be utilized by another process.

  • The operating system allows the users to share the computer simultaneously. Since each action or command in a time-shared system tends to be short, only a little CPU time is needed for each user.

  • As the system switches CPU rapidly from one user/program to the next, each user is given the impression that he/she has his/her own CPU, whereas actually one CPU is being shared among many users.

Multiprogramming

Sharing the processor, when two or more programs reside in memory at the same time, is referred to as multiprogramming. Multiprogramming assumes a single shared processor. Multiprogramming increases CPU utilization by organizing jobs so that the CPU always has one to execute.

The following figure shows the memory layout for a multiprogramming system.

Memory layout

An OS does the following activities related to multiprogramming.

  • The operating system keeps several jobs in memory at a time.

  • This set of jobs is a subset of the jobs kept in the job pool.

  • The operating system picks and begins to execute one of the jobs in the memory.

  • Multiprogramming operating systems monitor the state of all active programs and system resources using memory management programs to ensure that the CPU is never idle, unless there are no jobs to process.

Advantages

  • High and efficient CPU utilization.
  • User feels that many programs are allotted CPU almost simultaneously.

Disadvantages

  • CPU scheduling is required.
  • To accommodate many jobs in memory, memory management is required.

Interactivity

Interactivity refers to the ability of users to interact with a computer system. An Operating system does the following activities related to interactivity −

  • Provides the user an interface to interact with the system.
  • Manages input devices to take inputs from the user. For example, keyboard.
  • Manages output devices to show outputs to the user. For example, Monitor.

The response time of the OS needs to be short, since the user submits and waits for the result.

Real Time System

Real-time systems are usually dedicated, embedded systems. An operating system does the following activities related to real-time system activity.

  • In such systems, Operating Systems typically read from and react to sensor data.
  • The Operating system must guarantee response to events within fixed periods of time to ensure correct performance.

Distributed Environment

A distributed environment refers to multiple independent CPUs or processors in a computer system. An operating system does the following activities related to distributed environment −

  • The OS distributes computation logics among several physical processors.

  • The processors do not share memory or a clock. Instead, each processor has its own local memory.

  • The OS manages the communications between the processors. They communicate with each other through various communication lines.

Spooling

Spooling is an acronym for simultaneous peripheral operations on line. Spooling refers to putting data of various I/O jobs in a buffer. This buffer is a special area in memory or hard disk which is accessible to I/O devices.

An operating system does the following activities related to spooling −

  • Handles I/O device data spooling as devices have different data access rates.

  • Maintains the spooling buffer which provides a waiting station where data can rest while the slower device catches up.

  • Maintains parallel computation, because spooling allows a computer to perform I/O in parallel with computation. It becomes possible to have the computer read data from a tape, write data to disk and write out to a printer while it is doing its computing task.

Spooling

Advantages

  • The spooling operation uses a disk as a very large buffer.
  • Spooling is capable of overlapping I/O operation for one job with processor operations for another job.
 
 

OPERATING SYSTEM - PROCESSES 

 

Process

A process is basically a program in execution. The execution of a process must progress in a sequential fashion.

A process is defined as an entity which represents the basic unit of work to be implemented in the system.

To put it in simple terms, we write our computer programs in a text file and when we execute this program, it becomes a process which performs all the tasks mentioned in the program.

When a program is loaded into the memory and it becomes a process, it can be divided into four sections ─ stack, heap, text and data. The following image shows a simplified layout of a process inside main memory −

Process Components
  1. Stack − The process Stack contains the temporary data such as method/function parameters, return address and local variables.

  2. Heap − This is dynamically allocated memory to a process during its run time.

  3. Text − This includes the current activity represented by the value of Program Counter and the contents of the processor's registers.

  4. Data − This section contains the global and static variables.

Program

A program is a piece of code which may be a single line or millions of lines. A computer program is usually written by a computer programmer in a programming language. For example, here is a simple program written in C programming language −

#include <stdio.h>

int main() {
   printf("Hello, World! \n");
   return 0;
}

A computer program is a collection of instructions that performs a specific task when executed by a computer. When we compare a program with a process, we can conclude that a process is a dynamic instance of a computer program.

A part of a computer program that performs a well-defined task is known as an algorithm. A collection of computer programs, libraries and related data is referred to as software.
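To make the program/process distinction concrete, here is a small POSIX C sketch (an illustration only, assuming a Unix-like system where fork() is available): one program image gives rise to two processes, each with its own process ID and its own flow of execution.

#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    pid_t pid = fork();                 /* duplicate the calling process            */

    if (pid < 0) {                      /* fork failed                              */
        perror("fork");
        return 1;
    } else if (pid == 0) {              /* child: a new process from the same code  */
        printf("child  process, PID %d\n", (int)getpid());
    } else {                            /* parent: the original process             */
        wait(NULL);                     /* wait for the child to terminate          */
        printf("parent process, PID %d\n", (int)getpid());
    }
    return 0;
}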

Process Life Cycle

When a process executes, it passes through different states. These stages may differ in different operating systems, and the names of these states are also not standardized.

In general, a process can have one of the following five states at a time.

  1. Start − This is the initial state when a process is first started/created.

  2. Ready − The process is waiting to be assigned to a processor. Ready processes are waiting to have the processor allocated to them by the operating system so that they can run. A process may come into this state after the Start state, or while running, when it is interrupted by the scheduler so that the CPU can be assigned to some other process.

  3. Running − Once the process has been assigned to a processor by the OS scheduler, the process state is set to running and the processor executes its instructions.

  4. Waiting − The process moves into the waiting state if it needs to wait for a resource, such as waiting for user input, or waiting for a file to become available.

  5. Terminated or Exit − Once the process finishes its execution, or it is terminated by the operating system, it is moved to the terminated state where it waits to be removed from main memory.

Process States

Process Control Block (PCB)

A Process Control Block is a data structure maintained by the Operating System for every process. The PCB is identified by an integer process ID (PID). A PCB keeps all the information needed to keep track of a process as listed below in the table −

  1. Process State − The current state of the process, i.e., whether it is ready, running, waiting, etc.

  2. Process privileges − Required to allow/disallow access to system resources.

  3. Process ID − Unique identification for each process in the operating system.

  4. Pointer − A pointer to the parent process.

  5. Program Counter − A pointer to the address of the next instruction to be executed for this process.

  6. CPU registers − The various CPU registers whose contents must be saved and restored when the process runs.

  7. CPU Scheduling Information − Process priority and other scheduling information required to schedule the process.

  8. Memory management information − Information such as the page table, memory limits and segment table, depending on the memory scheme used by the operating system.

  9. Accounting information − The amount of CPU time used for process execution, time limits, execution ID, etc.

  10. IO status information − A list of I/O devices allocated to the process.

The architecture of a PCB is completely dependent on Operating System and may contain different information in different operating systems. Here is a simplified diagram of a PCB −

Process Control Block


The PCB is maintained for a process throughout its lifetime, and is deleted once the process terminates. 
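As a rough illustration, a PCB can be pictured as a C structure like the sketch below; the field names, types and sizes are assumptions made for readability, not the layout of any real kernel.

/* Hypothetical, much-simplified Process Control Block. */
enum proc_state { STATE_START, STATE_READY, STATE_RUNNING, STATE_WAITING, STATE_TERMINATED };

struct pcb {
    int             pid;               /* unique process ID                    */
    enum proc_state state;             /* ready, running, waiting, ...         */
    int             priority;          /* CPU-scheduling information           */
    unsigned long   program_counter;   /* address of the next instruction      */
    unsigned long   registers[16];     /* saved CPU registers                  */
    unsigned long   mem_base;          /* memory-management information ...    */
    unsigned long   mem_limit;         /* ... base and limit of the process    */
    unsigned long   cpu_time_used;     /* accounting information               */
    int             open_files[16];    /* I/O status information               */
    struct pcb     *parent;            /* pointer to the parent process        */
};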

 

 OPERATING SYSTEM - PROCESS SCHEDULING 

 

Definition

The process scheduling is the activity of the process manager that handles the removal of the running process from the CPU and the selection of another process on the basis of a particular strategy.

Process scheduling is an essential part of a Multiprogramming operating systems. Such operating systems allow more than one process to be loaded into the executable memory at a time and the loaded process shares the CPU using time multiplexing.

Process Scheduling Queues

The OS maintains all PCBs in Process Scheduling Queues. The OS maintains a separate queue for each of the process states and PCBs of all processes in the same execution state are placed in the same queue. When the state of a process is changed, its PCB is unlinked from its current queue and moved to its new state queue.

The Operating System maintains the following important process scheduling queues −

  • Job queue − This queue keeps all the processes in the system.

  • Ready queue − This queue keeps a set of all processes residing in main memory, ready and waiting to execute. A new process is always put in this queue.

  • Device queues − The processes which are blocked due to unavailability of an I/O device constitute this queue.

Process Scheduling Queuing

The OS can use different policies to manage each queue (FIFO, Round Robin, Priority, etc.). The OS scheduler determines how to move processes between the ready and run queues which can only have one entry per processor core on the system; in the above diagram, it has been merged with the CPU.

Two-State Process Model

Two-state process model refers to running and non-running states which are described below −

  1. Running − When a new process is created, it enters the system in the running state.

  2. Not Running − Processes that are not running are kept in a queue, waiting for their turn to execute. Each entry in the queue is a pointer to a particular process, and the queue is implemented using a linked list. The dispatcher works as follows: when a process is interrupted, it is transferred to the waiting queue; if the process has completed or aborted, it is discarded. In either case, the dispatcher then selects a process from the queue to execute.

Schedulers

Schedulers are special system software which handle process scheduling in various ways. Their main task is to select the jobs to be submitted into the system and to decide which process to run. Schedulers are of three types −

  • Long-Term Scheduler
  • Short-Term Scheduler
  • Medium-Term Scheduler

Long Term Scheduler

It is also called a job scheduler. A long-term scheduler determines which programs are admitted to the system for processing. It selects processes from the job queue and loads them into memory for execution, where they become candidates for CPU scheduling.

The primary objective of the job scheduler is to provide a balanced mix of jobs, such as I/O bound and processor bound. It also controls the degree of multiprogramming. If the degree of multiprogramming is stable, then the average rate of process creation must be equal to the average departure rate of processes leaving the system.

On some systems, the long-term scheduler may not be available or may be minimal. Time-sharing operating systems have no long-term scheduler. When a process changes state from new to ready, the long-term scheduler is involved.

Short Term Scheduler

It is also called the CPU scheduler. Its main objective is to increase system performance in accordance with the chosen set of criteria. It handles the transition of a process from the ready state to the running state: the CPU scheduler selects one process from among those that are ready to execute and allocates the CPU to it.

Short-term schedulers, also known as dispatchers, decide which process to execute next. Short-term schedulers are faster than long-term schedulers.

Medium Term Scheduler

Medium-term scheduling is a part of swapping. It removes processes from memory and thereby reduces the degree of multiprogramming. The medium-term scheduler is in charge of handling the swapped-out processes.

A running process may become suspended if it makes an I/O request. A suspended process cannot make any progress towards completion. In this condition, to remove the process from memory and make space for other processes, the suspended process is moved to secondary storage. This process is called swapping, and the process is said to be swapped out or rolled out. Swapping may be necessary to improve the process mix.

Comparison among Scheduler

S.N. | Long-Term Scheduler | Short-Term Scheduler | Medium-Term Scheduler
1 | It is a job scheduler. | It is a CPU scheduler. | It is a process swapping scheduler.
2 | Speed is lower than the short-term scheduler. | Speed is the fastest among the three. | Speed is in between the short-term and long-term schedulers.
3 | It controls the degree of multiprogramming. | It provides lesser control over the degree of multiprogramming. | It reduces the degree of multiprogramming.
4 | It is almost absent or minimal in time-sharing systems. | It is also minimal in time-sharing systems. | It is a part of time-sharing systems.
5 | It selects processes from the pool and loads them into memory for execution. | It selects those processes which are ready to execute. | It can re-introduce a process into memory so that its execution can be continued.

Context Switch

A context switch is the mechanism to store and restore the state or context of a CPU in the Process Control Block so that a process execution can be resumed from the same point at a later time. Using this technique, a context switcher enables multiple processes to share a single CPU. Context switching is an essential feature of a multitasking operating system.

When the scheduler switches the CPU from executing one process to execute another, the state from the current running process is stored into the process control block. After this, the state for the process to run next is loaded from its own PCB and used to set the PC, registers, etc. At that point, the second process can start executing.

Process Context Switch

Context switches are computationally intensive since register and memory state must be saved and restored. To reduce context-switching time, some hardware systems employ two or more sets of processor registers. When the process is switched, the following information is stored for later use.

  • Program Counter
  • Scheduling information
  • Base and limit register value
  • Currently used register
  • Changed State
  • I/O State information
  • Accounting information

 

OPERATING SYSTEM SCHEDULING ALGORITHMS  

A Process Scheduler schedules different processes to be assigned to the CPU based on particular scheduling algorithms. There are six popular process scheduling algorithms which we are going to discuss in this chapter −

  • First-Come, First-Served (FCFS) Scheduling
  • Shortest-Job-Next (SJN) Scheduling
  • Priority Scheduling
  • Shortest Remaining Time
  • Round Robin(RR) Scheduling
  • Multiple-Level Queues Scheduling

These algorithms are either non-preemptive or preemptive. Non-preemptive algorithms are designed so that once a process enters the running state, it cannot be preempted until it completes its allotted time, whereas preemptive scheduling is priority based: the scheduler may preempt a low-priority running process at any time when a high-priority process enters the ready state.

First Come First Serve (FCFS)

  • Jobs are executed on first come, first serve basis.
  • It is a non-preemptive scheduling algorithm.
  • Easy to understand and implement.
  • Its implementation is based on FIFO queue.
  • Poor in performance as average wait time is high.
First Come First Serve Scheduling Algorithm

Wait time of each process is as follows −

Process | Wait Time : Service Time - Arrival Time
P0 | 0 - 0 = 0
P1 | 5 - 1 = 4
P2 | 8 - 2 = 6
P3 | 16 - 3 = 13

Average Wait Time: (0+4+6+13) / 4 = 5.75
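The table above can be reproduced with a few lines of C. The arrival times (0, 1, 2, 3) and burst times (5, 3, 8, 6) are the assumed workload behind the example; the sketch simply runs the jobs in arrival order and accumulates the waits.

#include <stdio.h>

int main(void) {
    /* Assumed workload behind the example: arrival and burst (CPU) times. */
    int arrival[] = {0, 1, 2, 3};
    int burst[]   = {5, 3, 8, 6};
    int n = 4;

    int clock = 0, total_wait = 0;
    for (int i = 0; i < n; i++) {               /* FCFS: run in arrival order */
        if (clock < arrival[i])
            clock = arrival[i];                 /* CPU idle until arrival     */
        int wait = clock - arrival[i];          /* service time - arrival     */
        printf("P%d wait = %d\n", i, wait);
        total_wait += wait;
        clock += burst[i];                      /* run the job to completion  */
    }
    printf("Average wait = %.2f\n", (double)total_wait / n);   /* prints 5.75 */
    return 0;
}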

Shortest Job Next (SJN)

  • This is also known as shortest job first, or SJF

  • This is a non-preemptive scheduling algorithm.

  • Best approach to minimize waiting time.

  • Easy to implement in Batch systems where required CPU time is known in advance.

  • Impossible to implement in interactive systems where required CPU time is not known.

  • The processor should know in advance how much time the process will take.

Given: Table of processes, and their Arrival time, Execution time

Process | Arrival Time | Execution Time | Service Time
P0 | 0 | 5 | 0
P1 | 1 | 3 | 5
P2 | 2 | 8 | 14
P3 | 3 | 6 | 8
Shortest Job First Scheduling Algorithm

Waiting time of each process is as follows −

Process | Waiting Time
P0 | 0 - 0 = 0
P1 | 5 - 1 = 4
P2 | 14 - 2 = 12
P3 | 8 - 3 = 5

Average Wait Time: (0 + 4 + 12 + 5)/4 = 21 / 4 = 5.25

Priority Based Scheduling

  • Priority scheduling is a non-preemptive algorithm and one of the most common scheduling algorithms in batch systems.

  • Each process is assigned a priority. Process with highest priority is to be executed first and so on.

  • Processes with same priority are executed on first come first served basis.

  • Priority can be decided based on memory requirements, time requirements or any other resource requirement.

Given: Table of processes with their Arrival time, Execution time, and Priority. Here we consider 1 to be the lowest priority.

Process | Arrival Time | Execution Time | Priority | Service Time
P0 | 0 | 5 | 1 | 0
P1 | 1 | 3 | 2 | 11
P2 | 2 | 8 | 1 | 14
P3 | 3 | 6 | 3 | 5
Priority Scheduling Algorithm

Waiting time of each process is as follows −

Process | Waiting Time
P0 | 0 - 0 = 0
P1 | 11 - 1 = 10
P2 | 14 - 2 = 12
P3 | 5 - 3 = 2

Average Wait Time: (0 + 10 + 12 + 2)/4 = 24 / 4 = 6

Shortest Remaining Time

  • Shortest remaining time (SRT) is the preemptive version of the SJN algorithm.

  • The processor is allocated to the job closest to completion but it can be preempted by a newer ready job with shorter time to completion.

  • Impossible to implement in interactive systems where required CPU time is not known.

  • It is often used in batch environments where short jobs need to be given preference.

Round Robin Scheduling

  • Round Robin is a preemptive process scheduling algorithm.

  • Each process is provided a fixed time to execute, called a quantum.

  • Once a process has executed for the given time period, it is preempted and another process executes for its time period.

  • Context switching is used to save states of preempted processes.

Round Robin Scheduling Algorithm

Wait time of each process is as follows −

Process | Wait Time : Service Time - Arrival Time
P0 | (0 - 0) + (12 - 3) = 9
P1 | (3 - 1) = 2
P2 | (6 - 2) + (14 - 9) + (20 - 17) = 12
P3 | (9 - 3) + (17 - 12) = 11

Average Wait Time: (9+2+12+11) / 4 = 8.5
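A small C simulation reproduces these numbers. The assumed workload is the same four jobs used earlier (arrivals 0, 1, 2, 3 and bursts 5, 3, 8, 6) with a quantum of 3, and the sketch assumes the ready queue never empties, which holds for this example.

#include <stdio.h>

#define N 4

int main(void) {
    int arrival[N]    = {0, 1, 2, 3};
    int remaining[N]  = {5, 3, 8, 6};           /* CPU time still needed            */
    int last_ready[N] = {0, 1, 2, 3};           /* when each job last became ready  */
    int wait[N]       = {0};
    int quantum = 3;

    int queue[64], head = 0, tail = 0;
    int clock = 0, done = 0, next_arrival = 1;
    queue[tail++] = 0;                          /* P0 arrives at time 0             */

    while (done < N) {
        int p = queue[head++];
        wait[p] += clock - last_ready[p];       /* time spent waiting in the queue  */

        int run = remaining[p] < quantum ? remaining[p] : quantum;
        clock += run;
        remaining[p] -= run;

        while (next_arrival < N && arrival[next_arrival] <= clock)
            queue[tail++] = next_arrival++;     /* enqueue jobs that arrived so far */

        if (remaining[p] > 0) {                 /* quantum expired: requeue the job */
            last_ready[p] = clock;
            queue[tail++] = p;
        } else {
            done++;                             /* job finished                     */
        }
    }

    int total = 0;
    for (int i = 0; i < N; i++) {
        printf("P%d wait = %d\n", i, wait[i]);
        total += wait[i];
    }
    printf("Average wait = %.2f\n", (double)total / N);   /* prints 8.50 */
    return 0;
}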

Multiple-Level Queues Scheduling

Multiple-level queues are not an independent scheduling algorithm. They make use of other existing algorithms to group and schedule jobs with common characteristics.

  • Multiple queues are maintained for processes with common characteristics.
  • Each queue can have its own scheduling algorithms.
  • Priorities are assigned to each queue.

For example, CPU-bound jobs can be scheduled in one queue and all I/O-bound jobs in another queue. The Process Scheduler then alternately selects jobs from each queue and assigns them to the CPU based on the algorithm assigned to the queue. 

 

 OPERATING SYSTEM- MULTI THREADING 

 

 

What is Thread?

A thread is a flow of execution through the process code, with its own program counter that keeps track of which instruction to execute next, system registers which hold its current working variables, and a stack which contains the execution history.

A thread shares with its peer threads information such as the code segment, the data segment and open files. When one thread alters a code-segment memory item, all other threads see that change.

A thread is also called a lightweight process. Threads provide a way to improve application performance through parallelism. Threads represent a software approach to improving operating system performance by reducing the overhead associated with a classical (heavyweight) process.
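A minimal POSIX-threads sketch in C is shown below (an illustration only; it assumes pthread support and is compiled with -lpthread). Two threads of the same process execute the same function and update a shared global counter, which is exactly the kind of shared data described above.

#include <stdio.h>
#include <pthread.h>

int shared_counter = 0;                      /* data segment shared by both threads */
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void *worker(void *arg) {
    int id = *(int *)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);           /* protect the shared variable         */
        shared_counter++;
        pthread_mutex_unlock(&lock);
    }
    printf("thread %d finished\n", id);
    return NULL;
}

int main(void) {
    pthread_t t1, t2;
    int id1 = 1, id2 = 2;

    pthread_create(&t1, NULL, worker, &id1); /* two flows of control ...            */
    pthread_create(&t2, NULL, worker, &id2); /* ... inside one process              */
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    printf("shared_counter = %d\n", shared_counter);   /* prints 200000 */
    return 0;
}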

Each thread belongs to exactly one process and no thread can exist outside a process. Each thread represents a separate flow of control. Threads have been successfully used in implementing network servers and web servers. They also provide a suitable foundation for parallel execution of applications on shared-memory multiprocessors. The following figure shows the working of a single-threaded and a multithreaded process.

Single vs Multithreaded Process

Difference between Process and Thread

S.N. | Process | Thread
1 | Process is heavyweight or resource intensive. | Thread is lightweight, taking fewer resources than a process.
2 | Process switching needs interaction with the operating system. | Thread switching does not need to interact with the operating system.
3 | In multiple processing environments, each process executes the same code but has its own memory and file resources. | All threads can share the same set of open files and child processes.
4 | If one process is blocked, then no other process can execute until the first process is unblocked. | While one thread is blocked and waiting, a second thread in the same task can run.
5 | Multiple processes without using threads use more resources. | Multiple threaded processes use fewer resources.
6 | In multiple processes, each process operates independently of the others. | One thread can read, write or change another thread's data.

Advantages of Thread

  • Threads minimize the context switching time.
  • Use of threads provides concurrency within a process.
  • Efficient communication.
  • It is more economical to create and context switch threads.
  • Threads allow utilization of multiprocessor architectures to a greater scale and efficiency.

Types of Thread

Threads are implemented in following two ways −

  • User Level Threads − User managed threads.

  • Kernel Level Threads − Operating System managed threads acting on kernel, an operating system core.

User Level Threads

In this case, the thread management kernel is not aware of the existence of threads. The thread library contains code for creating and destroying threads, for passing message and data between threads, for scheduling thread execution and for saving and restoring thread contexts. The application starts with a single thread.

User level thread

Advantages

  • Thread switching does not require Kernel mode privileges.
  • User level thread can run on any operating system.
  • Scheduling can be application specific in the user level thread.
  • User level threads are fast to create and manage.

Disadvantages

  • In a typical operating system, most system calls are blocking.
  • Multithreaded application cannot take advantage of multiprocessing.

Kernel Level Threads

In this case, thread management is done by the Kernel. There is no thread management code in the application area. Kernel threads are supported directly by the operating system. Any application can be programmed to be multithreaded. All of the threads within an application are supported within a single process.

The Kernel maintains context information for the process as a whole and for individual threads within the process. Scheduling by the Kernel is done on a thread basis. The Kernel performs thread creation, scheduling and management in Kernel space. Kernel threads are generally slower to create and manage than user threads.

Advantages

  • The Kernel can simultaneously schedule multiple threads from the same process on multiple processors.
  • If one thread in a process is blocked, the Kernel can schedule another thread of the same process.
  • Kernel routines themselves can be multithreaded.

Disadvantages

  • Kernel threads are generally slower to create and manage than the user threads.
  • Transfer of control from one thread to another within the same process requires a mode switch to the Kernel.

Multithreading Models

Some operating systems provide a combined user-level thread and Kernel-level thread facility; Solaris is a good example of this combined approach. In a combined system, multiple threads within the same application can run in parallel on multiple processors, and a blocking system call need not block the entire process. There are three types of multithreading models −

  • Many to many relationship.
  • Many to one relationship.
  • One to one relationship.

Many to Many Model

The many-to-many model multiplexes any number of user threads onto an equal or smaller number of kernel threads.

The following diagram shows the many-to-many threading model, where 6 user-level threads are multiplexed onto 6 kernel-level threads. In this model, developers can create as many user threads as necessary, and the corresponding Kernel threads can run in parallel on a multiprocessor machine. This model provides a good level of concurrency: when a thread performs a blocking system call, the kernel can schedule another thread for execution.

Many to many thread model

Many to One Model

The many-to-one model maps many user-level threads to one Kernel-level thread. Thread management is done in user space by the thread library. When a thread makes a blocking system call, the entire process is blocked. Only one thread can access the Kernel at a time, so multiple threads are unable to run in parallel on multiprocessors.

If user-level thread libraries are implemented on an operating system whose kernel does not support threads, the many-to-one model is used.

Many to one thread model

One to One Model

There is a one-to-one relationship between user-level threads and kernel-level threads. This model provides more concurrency than the many-to-one model. It also allows another thread to run when a thread makes a blocking system call, and it allows multiple threads to execute in parallel on multiprocessors.

The disadvantage of this model is that creating a user thread requires creating the corresponding Kernel thread. OS/2, Windows NT and Windows 2000 use the one-to-one relationship model.

One to one thread model

Difference between User-Level & Kernel-Level Thread

S.N. | User-Level Threads | Kernel-Level Threads
1 | User-level threads are faster to create and manage. | Kernel-level threads are slower to create and manage.
2 | Implementation is by a thread library at the user level. | The operating system supports creation of Kernel threads.
3 | User-level threads are generic and can run on any operating system. | Kernel-level threads are specific to the operating system.
4 | Multi-threaded applications cannot take advantage of multiprocessing. | Kernel routines themselves can be multithreaded.

MEMORY MANAGEMENT 

 Memory management is the functionality of an operating system which handles or manages primary memory and moves processes back and forth between main memory and disk during execution. Memory management keeps track of each and every memory location, regardless of whether it is allocated to some process or free. It checks how much memory is to be allocated to processes. It decides which process will get memory at what time. It tracks whenever some memory gets freed or unallocated and updates the status accordingly.

This tutorial will teach you basic concepts related to Memory Management.

Process Address Space

The process address space is the set of logical addresses that a process references in its code. For example, when 32-bit addressing is in use, addresses can range from 0 to 0x7fffffff; that is, 2^31 possible numbers, for a total theoretical size of 2 gigabytes.

The operating system takes care of mapping the logical addresses to physical addresses at the time of memory allocation to the program. There are three types of addresses used in a program before and after memory is allocated −

  1. Symbolic addresses − The addresses used in source code. Variable names, constants, and instruction labels are the basic elements of the symbolic address space.

  2. Relative addresses − At the time of compilation, a compiler converts symbolic addresses into relative addresses.

  3. Physical addresses − The loader generates these addresses at the time when a program is loaded into main memory.

Virtual and physical addresses are the same in compile-time and load-time address-binding schemes. Virtual and physical addresses differ in execution-time address-binding scheme.

The set of all logical addresses generated by a program is referred to as a logical address space. The set of all physical addresses corresponding to these logical addresses is referred to as a physical address space.

The runtime mapping from virtual to physical address is done by the memory management unit (MMU) which is a hardware device. MMU uses following mechanism to convert virtual address to physical address.

  • The value in the base register is added to every address generated by a user process, which is treated as an offset at the time it is sent to memory. For example, if the base-register value is 10000, then an attempt by the user to use address location 100 will be dynamically relocated to location 10100 (see the sketch after this list).

  • The user program deals with virtual addresses; it never sees the real physical addresses.
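Here is a toy C sketch of that base/limit translation. The base value 10000 matches the example above; the limit value of 4000 is an assumption added purely for illustration.

#include <stdio.h>

/* Hypothetical relocation: physical = base + logical, checked against a limit. */
static unsigned long base_register  = 10000;
static unsigned long limit_register = 4000;      /* assumed size of the process  */

int translate(unsigned long logical, unsigned long *physical) {
    if (logical >= limit_register)
        return -1;                               /* protection fault             */
    *physical = base_register + logical;
    return 0;
}

int main(void) {
    unsigned long phys;
    if (translate(100, &phys) == 0)
        printf("logical 100 -> physical %lu\n", phys);   /* prints 10100 */
    if (translate(5000, &phys) != 0)
        printf("logical 5000 -> addressing error (beyond limit)\n");
    return 0;
}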

Static vs Dynamic Loading

The choice between static and dynamic loading is made when the computer program is being developed. If you load your program statically, then at compilation time the complete program is compiled and linked, without leaving any external program or module dependency. The linker combines the object program with the other necessary object modules into an absolute program, which also includes logical addresses.

If you are writing a dynamically loaded program, then your compiler compiles the program and, for all the modules you want to include dynamically, only references are provided; the rest of the work is done at execution time.

At the time of loading, with static loading, the absolute program (and data) is loaded into memory in order for execution to start.

If you are using dynamic loading, dynamic routines of the library are stored on a disk in relocatable form and are loaded into memory only when they are needed by the program.

Static vs Dynamic Linking

As explained above, when static linking is used, the linker combines all other modules needed by a program into a single executable program to avoid any runtime dependency.

When dynamic linking is used, it is not required to link the actual module or library with the program, rather a reference to the dynamic module is provided at the time of compilation and linking. Dynamic Link Libraries (DLL) in Windows and Shared Objects in Unix are good examples of dynamic libraries.

Swapping

Swapping is a mechanism in which a process can be swapped temporarily out of main memory (or move) to secondary storage (disk) and make that memory available to other processes. At some later time, the system swaps back the process from the secondary storage to main memory.

Though performance is usually affected by the swapping process, it helps in running multiple big processes in parallel, and that is the reason swapping is also known as a technique for memory compaction.

Process Swapping

The total time taken by swapping process includes the time it takes to move the entire process to a secondary disk and then to copy the process back to memory, as well as the time the process takes to regain main memory.

Let us assume that the user process is of size 2048KB and that the standard hard disk where swapping will take place has a data transfer rate of around 1 MB per second. The actual transfer of the 2048KB process to or from memory will take

2048KB / 1024KB per second = 2 seconds = 2000 milliseconds 

Now considering both the swap-out and swap-in time, it will take a full 4000 milliseconds, plus other overhead while the process competes to regain main memory.
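The same arithmetic can be written down as a tiny C sketch, using the sizes assumed in the example (a 2048 KB process and a 1024 KB-per-second transfer rate).

#include <stdio.h>

int main(void) {
    long process_kb = 2048;                            /* size of the user process */
    long rate_kb_s  = 1024;                            /* ~1 MB per second, in KB/s */

    long one_way_ms = process_kb * 1000 / rate_kb_s;   /* swap out (or in) once    */
    long round_trip = 2 * one_way_ms;                  /* swap out and back in     */

    printf("one-way transfer: %ld ms\n", one_way_ms);  /* prints 2000 ms           */
    printf("swap out + in   : %ld ms\n", round_trip);  /* prints 4000 ms           */
    return 0;
}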

Memory Allocation

Main memory usually has two partitions −

  • Low Memory − Operating system resides in this memory.

  • High Memory − User processes are held in high memory.

Operating system uses the following memory allocation mechanism.

  1. Single-partition allocation − In this type of allocation, the relocation-register scheme is used to protect user processes from each other, and from changing operating-system code and data. The relocation register contains the value of the smallest physical address, whereas the limit register contains the range of logical addresses. Each logical address must be less than the limit register.

  2. Multiple-partition allocation − In this type of allocation, main memory is divided into a number of fixed-sized partitions where each partition should contain only one process. When a partition is free, a process is selected from the input queue and is loaded into the free partition. When the process terminates, the partition becomes available for another process.

Fragmentation

As processes are loaded into and removed from memory, the free memory space is broken into little pieces. After some time, processes cannot be allocated to memory blocks because the blocks are too small, and the memory blocks remain unused. This problem is known as Fragmentation.

Fragmentation is of two types −

  1. External fragmentation − Total memory space is enough to satisfy a request or to hold a process, but it is not contiguous, so it cannot be used.

  2. Internal fragmentation − The memory block assigned to a process is bigger than requested. Some portion of the block is left unused, as it cannot be used by another process.

The following diagram shows how fragmentation can cause waste of memory and a compaction technique can be used to create more free memory out of fragmented memory −

Memory Fragmentation

External fragmentation can be reduced by compaction, that is, shuffling memory contents to place all free memory together in one large block. To make compaction feasible, relocation should be dynamic.

Internal fragmentation can be reduced by assigning the smallest partition that is still large enough for the process.

Paging

A computer can address more memory than the amount physically installed on the system. This extra memory is actually called virtual memory and it is a section of a hard disk that is set up to emulate the computer's RAM. The paging technique plays an important role in implementing virtual memory.

Paging is a memory management technique in which process address space is broken into blocks of the same size called pages (size is power of 2, between 512 bytes and 8192 bytes). The size of the process is measured in the number of pages.

Similarly, main memory is divided into small fixed-sized blocks of (physical) memory called frames and the size of a frame is kept the same as that of a page to have optimum utilization of the main memory and to avoid external fragmentation.

Paging

Address Translation

Page address is called logical address and represented by page number and the offset.

Logical Address = Page number + page offset 

Frame address is called physical address and represented by a frame number and the offset.

Physical Address = Frame number + page offset 

A data structure called page map table is used to keep track of the relation between a page of a process to a frame in physical memory.

Page Map Table

When the system allocates a frame to any page, it translates this logical address into a physical address and creates an entry in the page table to be used throughout execution of the program.
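The sketch below shows this page-table lookup in C. The 1 KB page size and the four-entry page table are assumptions chosen only to keep the example small.

#include <stdio.h>

#define PAGE_SIZE 1024                            /* assumed page size (power of 2) */

/* Hypothetical page map table: page number -> frame number. */
static int page_table[4] = {5, 2, 7, 0};

unsigned long translate(unsigned long logical) {
    unsigned long page   = logical / PAGE_SIZE;   /* page number                    */
    unsigned long offset = logical % PAGE_SIZE;   /* page offset                    */
    unsigned long frame  = page_table[page];      /* look up the frame              */
    return frame * PAGE_SIZE + offset;            /* physical address               */
}

int main(void) {
    unsigned long logical = 2 * PAGE_SIZE + 100;  /* page 2, offset 100             */
    printf("logical %lu -> physical %lu\n", logical, translate(logical));
    /* page 2 maps to frame 7, so the result is 7*1024 + 100 = 7268 */
    return 0;
}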

When a process is to be executed, its corresponding pages are loaded into any available memory frames. Suppose you have a program of 8KB but your memory can accommodate only 5KB at a given point in time; this is where the paging concept comes into the picture. When a computer runs out of RAM, the operating system (OS) moves idle or unwanted pages of memory to secondary memory to free up RAM for other processes, and brings them back when they are needed by the program.

This process continues during the whole execution of the program: the OS keeps removing idle pages from main memory, writing them to secondary memory, and bringing them back when required by the program.

Advantages and Disadvantages of Paging

Here is a list of advantages and disadvantages of paging −

  • Paging reduces external fragmentation, but still suffers from internal fragmentation.

  • Paging is simple to implement and is regarded as an efficient memory management technique.

  • Due to equal size of the pages and frames, swapping becomes very easy.

  • Page table requires extra memory space, so may not be good for a system having small RAM.

Segmentation

Segmentation is a memory management technique in which each job is divided into several segments of different sizes, one for each module that contains pieces that perform related functions. Each segment is actually a different logical address space of the program.

When a process is to be executed, its corresponding segments are loaded into non-contiguous memory, though every segment is loaded into a contiguous block of available memory.

Segmentation memory management works very similarly to paging, but here segments are of variable length, whereas in paging pages are of fixed size.

A program segment contains the program's main function, utility functions, data structures, and so on. The operating system maintains a segment map table for every process and a list of free memory blocks along with segment numbers, their size and corresponding memory locations in main memory. For each segment, the table stores the starting address of the segment and the length of the segment. A reference to a memory location includes a value that identifies a segment and an offset.

Segment Map Table

 

 

 VIRTUAL MEMORY

 

 A computer can address more memory than the amount physically installed on the system. This extra memory is actually called virtual memory and it is a section of a hard disk that's set up to emulate the computer's RAM.

The main visible advantage of this scheme is that programs can be larger than physical memory. Virtual memory serves two purposes. First, it allows us to extend the use of physical memory by using disk. Second, it allows us to have memory protection, because each virtual address is translated to a physical address.

Following are situations in which the entire program does not need to be fully loaded into main memory.

  • User-written error handling routines are used only when an error occurs in the data or computation.

  • Certain options and features of a program may be used rarely.

  • Many tables are assigned a fixed amount of address space even though only a small amount of the table is actually used.

  • The ability to execute a program that is only partially in memory would confer many benefits.

  • Fewer I/O operations would be needed to load or swap each user program into memory.

  • A program would no longer be constrained by the amount of physical memory that is available.

  • Each user program could take less physical memory, so more programs could be run at the same time, with a corresponding increase in CPU utilization and throughput.

In modern microprocessors intended for general-purpose use, a memory management unit, or MMU, is built into the hardware. The MMU's job is to translate virtual addresses into physical addresses. A basic example is given below −

Virtual Memory

Virtual memory is commonly implemented by demand paging. It can also be implemented in a segmentation system. Demand segmentation can also be used to provide virtual memory.

Demand Paging

A demand paging system is quite similar to a paging system with swapping, where processes reside in secondary memory and pages are loaded only on demand, not in advance. When a context switch occurs, the operating system does not copy any of the old program's pages out to disk or any of the new program's pages into main memory. Instead, it just begins executing the new program after loading its first page and fetches that program's pages as they are referenced.

Demand Paging

While executing a program, if the program references a page which is not available in main memory because it was swapped out a little earlier, the processor treats this invalid memory reference as a page fault and transfers control from the program to the operating system to demand the page back into memory.

Advantages

Following are the advantages of Demand Paging −

  • Large virtual memory.
  • More efficient use of memory.
  • There is no limit on degree of multiprogramming.

Disadvantages

  • Number of tables and the amount of processor overhead for handling page interrupts are greater than in the case of the simple paged management techniques.

Page Replacement Algorithm

Page replacement algorithms are the techniques by which an operating system decides which memory pages to swap out (write to disk) when a page of memory needs to be allocated. Page replacement happens whenever a page fault occurs and a free page cannot be used for the allocation, either because no pages are available or because the number of free pages is lower than required.

When the page that was selected for replacement and paged out is referenced again, it has to be read in from disk, and this requires waiting for I/O completion. This determines the quality of the page replacement algorithm: the less time spent waiting for page-ins, the better the algorithm.

A page replacement algorithm looks at the limited information about page accesses provided by the hardware, and tries to select which pages should be replaced so as to minimize the total number of page misses, while balancing this against the costs of primary storage and the processor time of the algorithm itself. There are many different page replacement algorithms. We evaluate an algorithm by running it on a particular string of memory references and computing the number of page faults.

Reference String

The string of memory references is called the reference string. Reference strings are generated artificially or by tracing a given system and recording the address of each memory reference. The latter choice produces a large amount of data, about which we note two things.

  • For a given page size, we need to consider only the page number, not the entire address.

  • If we have a reference to a page p, then any immediately following references to page p will never cause a page fault. Page p will be in memory after the first reference; the immediately following references will not fault.

  • For example, consider the following sequence of addresses − 123, 215, 600, 1234, 76, 96

  • If the page size is 100, then the reference string is 1, 2, 6, 12, 0, 0 (a short sketch of this conversion follows this list).
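That conversion is just an integer division, as the short C sketch below shows, using the addresses and the page size of 100 from the example.

#include <stdio.h>

int main(void) {
    int addresses[] = {123, 215, 600, 1234, 76, 96};
    int n = 6, page_size = 100;

    printf("reference string:");
    for (int i = 0; i < n; i++)
        printf(" %d", addresses[i] / page_size);   /* keep only the page number */
    printf("\n");                                  /* prints 1 2 6 12 0 0       */
    return 0;
}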

First In First Out (FIFO) algorithm

  • The oldest page in main memory is the one which will be selected for replacement.

  • Easy to implement: keep a list, replace pages from the tail and add new pages at the head (a page-fault counting sketch follows below).

First In First Out
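Here is a minimal page-fault counter for FIFO replacement in C; the three physical frames and the reference string used are assumptions for illustration.

#include <stdio.h>

#define FRAMES 3

int main(void) {
    int ref[] = {1, 2, 6, 12, 0, 0, 1, 2};    /* assumed reference string      */
    int n = 8;
    int frame[FRAMES] = {-1, -1, -1};         /* -1 means the frame is empty   */
    int next_victim = 0, faults = 0;

    for (int i = 0; i < n; i++) {
        int hit = 0;
        for (int j = 0; j < FRAMES; j++)
            if (frame[j] == ref[i]) { hit = 1; break; }

        if (!hit) {                           /* page fault                    */
            frame[next_victim] = ref[i];      /* replace the oldest page       */
            next_victim = (next_victim + 1) % FRAMES;
            faults++;
        }
    }
    printf("page faults = %d\n", faults);     /* prints 7 for this string      */
    return 0;
}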

Optimal Page algorithm

  • An optimal page-replacement algorithm has the lowest page-fault rate of all algorithms. An optimal page-replacement algorithm exists, and has been called OPT or MIN.

  • Replace the page that will not be used for the longest period of time. Use the time when a page is to be used.

Optimal page replacement

Least Recently Used (LRU) algorithm

  • Page which has not been used for the longest time in main memory is the one which will be selected for replacement.

  • Easy to implement, keep a list, replace pages by looking back into time.

Least Recently Used

Page Buffering algorithm

  • To get a process start quickly, keep a pool of free frames.
  • On page fault, select a page to be replaced.
  • Write the new page in the frame of free pool, mark the page table and restart the process.
  • Now write the dirty page out to disk and place the frame holding the replaced page in the free pool.

Least frequently Used(LFU) algorithm

  • The page with the smallest count is the one which will be selected for replacement.

  • This algorithm suffers from the situation in which a page is used heavily during the initial phase of a process, but then is never used again.

Most frequently Used(MFU) algorithm

  • This algorithm is based on the argument that the page with the smallest count was probably just brought in and has yet to be used.

 

 I/O HARDWARE

 

 One of the important jobs of an Operating System is to manage various I/O devices including mouse, keyboards, touch pad, disk drives, display adapters, USB devices, Bit-mapped screen, LED, Analog-to-digital converter, On/off switch, network connections, audio I/O, printers etc.

An I/O system is required to take an application I/O request and send it to the physical device, then take whatever response comes back from the device and send it to the application. I/O devices can be divided into two categories −

  • Block devices − A block device is one with which the driver communicates by sending entire blocks of data. For example, Hard disks, USB cameras, Disk-On-Key etc.

  • Character devices − A character device is one with which the driver communicates by sending and receiving single characters (bytes, octets). For example, serial ports, parallel ports, sound cards, etc.

Device Controllers

Device drivers are software modules that can be plugged into an OS to handle a particular device. Operating System takes help from device drivers to handle all I/O devices.

The Device Controller works like an interface between a device and a device driver. I/O units (Keyboard, mouse, printer, etc.) typically consist of a mechanical component and an electronic component where electronic component is called the device controller.

There is always a device controller and a device driver for each device to communicate with the Operating Systems. A device controller may be able to handle multiple devices. As an interface its main task is to convert serial bit stream to block of bytes, perform error correction as necessary.

Any device connected to the computer is connected by a plug and socket, and the socket is connected to a device controller. Following is a model for connecting the CPU, memory, controllers, and I/O devices where CPU and device controllers all use a common bus for communication.

Device Controllers

Synchronous vs asynchronous I/O

  • Synchronous I/O − In this scheme CPU execution waits while I/O proceeds

  • Asynchronous I/O − I/O proceeds concurrently with CPU execution

Communication to I/O Devices

The CPU must have a way to pass information to and from an I/O device. There are three approaches available for communication between the CPU and a device.

  • Special Instruction I/O
  • Memory-mapped I/O
  • Direct memory access (DMA)

Special Instruction I/O

This uses CPU instructions that are specifically made for controlling I/O devices. These instructions typically allow data to be sent to an I/O device or read from an I/O device.

Memory-mapped I/O

When using memory-mapped I/O, the same address space is shared by memory and I/O devices. The device is connected directly to certain main memory locations so that I/O device can transfer block of data to/from memory without going through CPU.

Memory-mapped I/O

While using memory mapped IO, OS allocates buffer in memory and informs I/O device to use that buffer to send data to the CPU. I/O device operates asynchronously with CPU, interrupts CPU when finished.

The advantage to this method is that every instruction which can access memory can be used to manipulate an I/O device. Memory mapped IO is used for most high-speed I/O devices like disks, communication interfaces.

Direct Memory Access (DMA)

Slow devices like keyboards will generate an interrupt to the main CPU after each byte is transferred. If a fast device such as a disk generated an interrupt for each byte, the operating system would spend most of its time handling these interrupts. So a typical computer uses direct memory access (DMA) hardware to reduce this overhead.

Direct Memory Access (DMA) means CPU grants I/O module authority to read from or write to memory without involvement. DMA module itself controls exchange of data between main memory and the I/O device. CPU is only involved at the beginning and end of the transfer and interrupted only after entire block has been transferred.

Direct Memory Access needs a special hardware called DMA controller (DMAC) that manages the data transfers and arbitrates access to the system bus. The controllers are programmed with source and destination pointers (where to read/write the data), counters to track the number of transferred bytes, and settings, which includes I/O and memory types, interrupts and states for the CPU cycles.

DMA

The operating system uses the DMA hardware as follows −

Step | Description
1 | The device driver is instructed to transfer disk data to a buffer at address X.
2 | The device driver then instructs the disk controller to transfer the data to the buffer.
3 | The disk controller starts the DMA transfer.
4 | The disk controller sends each byte to the DMA controller.
5 | The DMA controller transfers bytes to the buffer, increasing the memory address and decreasing the counter C until C becomes zero.
6 | When C becomes zero, the DMA controller interrupts the CPU to signal transfer completion.

Polling vs Interrupts I/O

A computer must have a way of detecting the arrival of any type of input. There are two ways that this can happen, known as polling and interrupts. Both of these techniques allow the processor to deal with events that can happen at any time and that are not related to the process it is currently running.

Polling I/O

Polling is the simplest way for an I/O device to communicate with the processor. The process of periodically checking status of the device to see if it is time for the next I/O operation, is called polling. The I/O device simply puts the information in a Status register, and the processor must come and get the information.

Most of the time, devices will not require attention; when one does, it will have to wait until it is next interrogated by the polling program. This is an inefficient method and much of the processor's time is wasted on unnecessary polls.

Compare this method to a teacher continually asking every student in a class, one after another, if they need help. Obviously the more efficient method would be for a student to inform the teacher whenever they require assistance.

Interrupts I/O

An alternative scheme for dealing with I/O is the interrupt-driven method. An interrupt is a signal to the microprocessor from a device that requires attention.

A device controller puts an interrupt signal on the bus when it needs the CPU's attention. When the CPU receives an interrupt, it saves its current state and invokes the appropriate interrupt handler using the interrupt vector (the addresses of OS routines that handle various events). When the interrupting device has been dealt with, the CPU continues with its original task as if it had never been interrupted.

 

I/O SOFTWARE 

 

 I/O software is often organized in the following layers −

  • User Level Libraries − This provides simple interface to the user program to perform input and output. For example, stdio is a library provided by C and C++ programming languages.

  • Kernel Level Modules − This provides device driver to interact with the device controller and device independent I/O modules used by the device drivers.

  • Hardware − This layer includes the actual hardware and the hardware controllers, which interact with the device drivers and make the hardware work.

A key concept in the design of I/O software is that it should be device independent where it should be possible to write programs that can access any I/O device without having to specify the device in advance. For example, a program that reads a file as input should be able to read a file on a floppy disk, on a hard disk, or on a CD-ROM, without having to modify the program for each different device.

I/O Softwares

Device Drivers

Device drivers are software modules that can be plugged into an OS to handle a particular device. The operating system takes help from device drivers to handle all I/O devices. Device drivers encapsulate device-dependent code and implement a standard interface in such a way that the code contains device-specific register reads/writes. A device driver is generally written by the device's manufacturer and delivered along with the device on a CD-ROM.

A device driver performs the following jobs −

  • To accept requests from the device-independent software above it.
  • To interact with the device controller to perform I/O and carry out any required error handling.
  • To make sure that the request is executed successfully.

How a device driver handles a request is as follows: suppose a request comes in to read block N. If the driver is idle at the time the request arrives, it starts carrying out the request immediately. Otherwise, if the driver is already busy with some other request, it places the new request in the queue of pending requests.
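A highly simplified sketch of that idle/busy decision is shown below in C. The structure and function names are invented for illustration and do not correspond to any real driver framework.

#include <stdio.h>
#include <stdlib.h>

/* Hypothetical block-read request and a singly linked pending queue. */
struct request { int block; struct request *next; };

static struct request *pending_head = NULL, *pending_tail = NULL;
static int driver_busy = 0;

static void start_io(int block) {            /* pretend to program the device     */
    driver_busy = 1;
    printf("driver: reading block %d\n", block);
}

void submit_read(int block) {
    if (!driver_busy) {                      /* idle: serve the request right away */
        start_io(block);
        return;
    }
    struct request *r = malloc(sizeof *r);   /* busy: append to the pending queue  */
    r->block = block;
    r->next  = NULL;
    if (pending_tail) pending_tail->next = r; else pending_head = r;
    pending_tail = r;
    printf("driver busy: queued block %d\n", block);
}

void io_complete(void) {                     /* called when the current I/O ends   */
    driver_busy = 0;
    if (pending_head) {                      /* start the next queued request      */
        struct request *r = pending_head;
        pending_head = r->next;
        if (!pending_head) pending_tail = NULL;
        start_io(r->block);
        free(r);
    }
}

int main(void) {
    submit_read(10);                         /* served immediately                 */
    submit_read(42);                         /* queued behind the first            */
    io_complete();                           /* block 42 now starts                */
    io_complete();
    return 0;
}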

Interrupt handlers

An interrupt handler, also known as an interrupt service routine (ISR), is a piece of software, more specifically a callback function in an operating system or a device driver, whose execution is triggered by the reception of an interrupt.

When the interrupt happens, the interrupt procedure does whatever it has to do in order to handle the interrupt, updates data structures and wakes up the process that was waiting for the interrupt to happen.

The interrupt mechanism accepts an address ─ a number that selects a specific interrupt handling routine/function from a small set. In most architectures, this address is an offset stored in a table called the interrupt vector table. This vector contains the memory addresses of specialized interrupt handlers.

Device-Independent I/O Software

The basic function of the device-independent software is to perform the I/O functions that are common to all devices and to provide a uniform interface to the user-level software. Though it is difficult to write completely device-independent software, we can write some modules which are common among all devices. Following is a list of functions of device-independent I/O software −

  • Uniform interfacing for device drivers
  • Device naming - Mnemonic names mapped to Major and Minor device numbers
  • Device protection
  • Providing a device-independent block size
  • Buffering, because data coming off a device cannot always be stored directly in its final destination.
  • Storage allocation on block devices
  • Allocating and releasing dedicated devices
  • Error Reporting

User-Space I/O Software

These are the libraries which provide a richer and simplified interface to access the functionality of the kernel, or ultimately to interact with the device drivers. Most user-level I/O software consists of library procedures, with some exceptions such as the spooling system, which is a way of dealing with dedicated I/O devices in a multiprogramming system.

I/O libraries (e.g., stdio) live in user space and provide an interface to the OS-resident, device-independent I/O software. For example, putchar(), getchar(), printf() and scanf() are routines from the user-level I/O library stdio available in C programming.
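For instance, the short C program below uses only user-level stdio routines; the library in turn issues the underlying system calls to the kernel's device-independent I/O layer on the program's behalf.

#include <stdio.h>

int main(void) {
    char name[64];

    printf("Enter your name: ");             /* user-level library call ...        */
    if (scanf("%63s", name) == 1)            /* ... which issues read/write        */
        printf("Hello, %s!\n", name);        /*     system calls underneath        */

    putchar('\n');
    return 0;
}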

Kernel I/O Subsystem

Kernel I/O Subsystem is responsible to provide many services related to I/O. Following are some of the services provided.

  • Scheduling − Kernel schedules a set of I/O requests to determine a good order in which to execute them. When an application issues a blocking I/O system call, the request is placed on the queue for that device. The Kernel I/O scheduler rearranges the order of the queue to improve the overall system efficiency and the average response time experienced by the applications.

  • Buffering − The Kernel I/O Subsystem maintains a memory area known as a buffer that stores data while they are transferred between two devices or between a device and an application. Buffering is done to cope with a speed mismatch between the producer and consumer of a data stream, or to adapt between devices that have different data transfer sizes.

  • Caching − Kernel maintains cache memory which is region of fast memory that holds copies of data. Access to the cached copy is more efficient than access to the original.

  • Spooling and Device Reservation − A spool is a buffer that holds output for a device, such as a printer, that cannot accept interleaved data streams. The spooling system copies the queued spool files to the printer one at a time. In some operating systems, spooling is managed by a system daemon process; in others, it is handled by an in-kernel thread.

  • Error Handling − An operating system that uses protected memory can guard against many kinds of hardware and application errors.

 

FILE SYSTEM  

File

A file is a named collection of related information that is recorded on secondary storage such as magnetic disks, magnetic tapes and optical disks. In general, a file is a sequence of bits, bytes, lines or records whose meaning is defined by the file's creator and user.

File Structure

A File Structure should be according to a required format that the operating system can understand.

  • A file has a certain defined structure according to its type.

  • A text file is a sequence of characters organized into lines.

  • A source file is a sequence of procedures and functions.

  • An object file is a sequence of bytes organized into blocks that are understandable by the machine.

  • When an operating system defines different file structures, it also contains the code to support these file structures. Unix and MS-DOS support a minimum number of file structures.

File Type

File type refers to the ability of the operating system to distinguish different types of file, such as text files, source files and binary files. Many operating systems support many types of files. Operating systems like MS-DOS and UNIX have the following types of files −

Ordinary files

  • These are the files that contain user information.
  • These may have text, databases or executable program.
  • The user can apply various operations on such files like add, modify, delete or even remove the entire file.

Directory files

  • These files contain list of file names and other information related to these files.

Special files

  • These files are also known as device files.
  • These files represent physical device like disks, terminals, printers, networks, tape drive etc.

These files are of two types −

  • Character special files − data is handled character by character as in case of terminals or printers.

  • Block special files − data is handled in blocks as in the case of disks and tapes.

File Access Mechanisms

File access mechanism refers to the manner in which the records of a file may be accessed. There are several ways to access files −

  • Sequential access
  • Direct/Random access
  • Indexed sequential access

Sequential access

A sequential access is that in which the records are accessed in some sequence, i.e., the information in the file is processed in order, one record after the other. This access method is the most primitive one. Example: Compilers usually access files in this fashion.

Direct/Random access

  • Random access file organization provides direct access to the records.

  • Each record has its own address in the file, with the help of which it can be accessed directly for reading or writing (see the sketch after this list).

  • The records need not be in any sequence within the file and they need not be in adjacent locations on the storage medium.

Indexed sequential access

  • This mechanism is built on top of sequential access.
  • An index is created for each file which contains pointers to various blocks.
  • The index is searched sequentially and its pointer is used to access the file directly.
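The following Python sketch contrasts the two basic mechanisms on a file of fixed-size records. It is only an illustration under assumed values (a 32-byte record size and a hypothetical file name), not the internals of any real operating system: sequential access walks the records in order, while direct access uses seek() to jump straight to record n.

    RECORD_SIZE = 32  # assumed fixed record size in bytes

    def read_sequential(path):
        """Visit every record in order, one after the other."""
        with open(path, "rb") as f:
            while True:
                record = f.read(RECORD_SIZE)
                if not record:
                    break
                yield record

    def read_direct(path, n):
        """Jump straight to record number n (0-based) without reading the rest."""
        with open(path, "rb") as f:
            f.seek(n * RECORD_SIZE)    # compute the record's address and go there
            return f.read(RECORD_SIZE)

    # Example usage (hypothetical file name):
    # for rec in read_sequential("records.dat"): ...
    # tenth_record = read_direct("records.dat", 9)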

Space Allocation

Files are allocated disk space by the operating system. Operating systems deploy the following three main ways to allocate disk space to files (a toy sketch of linked and indexed allocation follows these subsections).

  • Contiguous Allocation
  • Linked Allocation
  • Indexed Allocation

Contiguous Allocation

  • Each file occupies a contiguous address space on disk.
  • Assigned disk address is in linear order.
  • Easy to implement.
  • External fragmentation is a major issue with this type of allocation technique.

Linked Allocation

  • Each file carries a list of links to disk blocks.
  • The directory contains a link/pointer to the first block of a file.
  • No external fragmentation.
  • Effective for sequential access files.
  • Inefficient for direct access files.

Indexed Allocation

  • Provides solutions to the problems of contiguous and linked allocation.
  • An index block is created holding all the pointers for a file.
  • Each file has its own index block, which stores the addresses of the disk space occupied by the file.
  • The directory contains the addresses of the index blocks of files.
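The toy Python sketch below is a deliberate simplification (a list stands in for the disk, and the block numbers are made up) meant only to show the bookkeeping difference: under linked allocation each block stores the address of the next one, while under indexed allocation one index block lists all of a file's data blocks and the directory stores a single address per file.

    # Toy disk: each entry is either None (free) or a dict describing a block.
    DISK_BLOCKS = 16
    disk = [None] * DISK_BLOCKS

    def allocate_linked(block_numbers):
        """Chain the given free blocks together; return the first block number."""
        for here, nxt in zip(block_numbers, block_numbers[1:] + [None]):
            disk[here] = {"data": b"", "next": nxt}   # each block points to the next
        return block_numbers[0]

    def allocate_indexed(index_block, data_blocks):
        """Store every data-block address inside one index block."""
        disk[index_block] = {"index": list(data_blocks)}
        for b in data_blocks:
            disk[b] = {"data": b""}
        return index_block

    # The directory keeps only one address per file:
    directory = {
        "notes.txt": allocate_linked([2, 5, 9]),      # first block of the chain
        "photo.raw": allocate_indexed(1, [3, 4, 7]),  # address of the index block
    }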

 

 SECURITY

Security refers to providing a protection system for computer system resources such as the CPU, memory, disk, software programs and, most importantly, the data/information stored in the computer system. If a computer program is run by an unauthorized user, then he/she may cause severe damage to the computer or the data stored in it. So a computer system must be protected against unauthorized access, malicious access to system memory, viruses, worms, etc. We are going to discuss the following topics in this chapter.

  • Authentication
  • One Time passwords
  • Program Threats
  • System Threats
  • Computer Security Classifications

Authentication

Authentication refers to identifying each user of the system and associating the executing programs with those users. It is the responsibility of the operating system to create a protection system which ensures that a user who is running a particular program is authentic. Operating systems generally identify/authenticate users in the following three ways (a minimal password-checking sketch follows this list) −

  • Username / Password − The user needs to enter a registered username and password with the operating system to log in to the system.

  • User card/key − The user needs to punch a card into the card slot, or enter a key generated by a key generator, in the option provided by the operating system to log in to the system.

  • User attribute (fingerprint / eye retina pattern / signature) − The user needs to pass his/her attribute via a designated input device used by the operating system to log in to the system.
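As a hedged illustration of the username/password mechanism (not the credential store of any particular operating system), the Python sketch below keeps only a salt and a hash for each account and compares hashes rather than plain-text passwords. The user name, password and iteration count are example assumptions.

    import hashlib, hmac, os

    ITERATIONS = 200_000  # assumed work factor

    def hash_password(password, salt):
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)

    # Hypothetical account database: store only salt + hash, never the password.
    salt = os.urandom(16)
    users = {"alice": (salt, hash_password("correct horse battery", salt))}

    def authenticate(username, password):
        record = users.get(username)
        if record is None:
            return False
        salt, stored = record
        # constant-time comparison to avoid leaking information through timing
        return hmac.compare_digest(stored, hash_password(password, salt))

    print(authenticate("alice", "wrong guess"))            # False
    print(authenticate("alice", "correct horse battery"))  # True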

One Time passwords

One-time passwords provide additional security along with normal authentication. In a one-time password system, a unique password is required every time a user tries to log in to the system. Once a one-time password is used, it cannot be used again. One-time passwords are implemented in various ways (a small HMAC-based sketch follows the list below).

  • Random numbers − Users are provided with cards that have numbers printed alongside corresponding letters. The system asks for the numbers corresponding to a few randomly chosen letters.

  • Secret key − Users are provided with a hardware device which can create a secret id mapped to the user id. The system asks for this secret id, which must be generated afresh every time prior to login.

  • Network password − Some commercial applications send one-time passwords to the user's registered mobile number/email, which must be entered prior to login.
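In the spirit of the secret-key device described above, here is a minimal counter-based one-time password sketch built from Python's standard hmac module, loosely following the well-known HOTP construction. The shared secret and the counter handling are simplified assumptions; a real system would rely on a vetted library and synchronized counters.

    import hmac, hashlib, struct

    def one_time_password(secret, counter, digits=6):
        """Derive a short numeric code from a shared secret and a counter."""
        msg = struct.pack(">Q", counter)                       # 8-byte big-endian counter
        digest = hmac.new(secret, msg, hashlib.sha1).digest()
        offset = digest[-1] & 0x0F                             # dynamic truncation
        code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
        return str(code % 10 ** digits).zfill(digits)

    secret = b"shared-secret-example"    # assumed shared secret
    print(one_time_password(secret, 0))  # device and server both move to counter 1 next
    print(one_time_password(secret, 1))  # a used counter value is never accepted again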

Program Threats

An operating system's processes and kernel perform their designated tasks as instructed. If a user program makes these processes do malicious tasks, this is known as a program threat. One common example of a program threat is a program installed on a computer which can store and send user credentials over the network to a hacker. Following is a list of some well-known program threats.

  • Trojan Horse − Such a program traps user login credentials and stores them to send to a malicious user, who can later log in to the computer and access system resources.

  • Trap Door − If a program which is designed to work as required has a security hole in its code and performs illegal actions without the knowledge of the user, then it is said to have a trap door.

  • Logic Bomb − A logic bomb is a situation in which a program misbehaves only when certain conditions are met; otherwise it works as a genuine program. This makes it harder to detect.

  • Virus − A virus, as the name suggests, can replicate itself on a computer system. Viruses are highly dangerous and can modify/delete user files and crash systems. A virus is generally a small piece of code embedded in a program. As the user accesses the program, the virus starts getting embedded in other files/programs and can make the system unusable for the user.

System Threats

System threats refer to the misuse of system services and network connections to put the user in trouble. System threats can be used to launch program threats on a complete network, which is called a program attack. System threats create an environment in which operating system resources/user files are misused. Following is a list of some well-known system threats.

  • Worm − A worm is a process which can choke down system performance by using system resources to extreme levels. A worm process generates multiple copies of itself, where each copy uses system resources and prevents other processes from getting the resources they need. Worm processes can even shut down an entire network.

  • Port Scanning − Port scanning is a mechanism or means by which a hacker can detect system vulnerabilities in order to attack the system.

  • Denial of Service − Denial of service attacks normally prevent the user from making legitimate use of the system. For example, a user may not be able to use the Internet if a denial of service attack targets the browser's content settings.

Computer Security Classifications

As per the U.S. Department of Defense Trusted Computer System Evaluation Criteria, there are four security classifications in computer systems: A, B, C, and D. These are widely used specifications to determine and model the security of systems and of security solutions. Following is a brief description of each classification.

S.N.  Classification Type & Description

1  Type A

Highest level. Uses formal design specifications and verification techniques. Grants a high degree of assurance of process security.

2  Type B

Provides a mandatory protection system. Has all the properties of a class C2 system. Attaches a sensitivity label to each object. It is of three types.

  • B1 − Maintains the security label of each object in the system. The label is used for making access-control decisions.

  • B2 − Extends the sensitivity labels to each system resource, such as storage objects; supports covert channels and auditing of events.

  • B3 − Allows creating lists or user groups for access control, in order to grant or revoke access to a given named object.

3  Type C

Provides protection and user accountability using audit capabilities. It is of two types.

  • C1 − Incorporates controls so that users can protect their private information and keep other users from accidentally reading/deleting their data. UNIX versions are mostly C1 class.

  • C2 − Adds individual-level access control to the capabilities of a C1 level system.

4  Type D

Lowest level. Minimum protection. MS-DOS and Windows 3.1 fall in this category.

 

LINUX

 

Linux is one of the popular versions of the UNIX operating system. It is open source, as its source code is freely available, and it is free to use. Linux was designed with UNIX compatibility in mind, and its functionality list is quite similar to that of UNIX.

Components of Linux System

The Linux operating system primarily has three components −

  • Kernel − The kernel is the core part of Linux. It is responsible for all major activities of this operating system. It consists of various modules and it interacts directly with the underlying hardware. The kernel provides the required abstraction to hide low-level hardware details from system or application programs.

  • System Library − System libraries are special functions or programs through which application programs or system utilities access the kernel's features. These libraries implement most of the functionalities of the operating system and do not require the kernel module's code access rights.

  • System Utility − System utility programs are responsible for performing specialized, individual-level tasks.


Kernel Mode vs User Mode

Kernel component code executes in a special privileged mode called kernel mode, with full access to all resources of the computer. This code represents a single process, executes in a single address space and does not require any context switch; hence it is very efficient and fast. The kernel runs each process, provides system services to processes and gives processes protected access to the hardware.

Support code which is not required to run in kernel mode is placed in the system libraries. User programs and other system programs work in user mode, which has no direct access to the system hardware or kernel code. User programs/utilities use system libraries to access kernel functions and get the system's low-level tasks done (a small sketch of this follows).
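The short Python sketch below shows this from the user-mode side: the program never drives the disk hardware itself, it calls standard library wrappers (the os module here), which issue system calls that the kernel services in kernel mode. The path /etc/hostname is an assumption that is typical, though not guaranteed, on Linux systems.

    import os

    # User-mode code asks the kernel for services through library wrappers
    # around system calls; it cannot drive the disk or terminal hardware directly.
    fd = os.open("/etc/hostname", os.O_RDONLY)   # wraps the open(2) system call
    data = os.read(fd, 64)                       # wraps read(2): the kernel does the I/O
    os.close(fd)                                 # wraps close(2)

    print("process", os.getpid(), "read:", data.decode().strip())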

Basic Features

Following are some of the important features of Linux Operating System.

  • Portable − Portability means that the software can work on different types of hardware in the same way. The Linux kernel and application programs support installation on almost any kind of hardware platform.

  • Open Source − Linux source code is freely available and it is a community-based development project. Multiple teams work in collaboration to enhance the capability of the Linux operating system, and it is continuously evolving.

  • Multi-User − Linux is a multi-user system, meaning multiple users can access system resources like memory, RAM and application programs at the same time.

  • Multiprogramming − Linux is a multiprogramming system, meaning multiple applications can run at the same time.

  • Hierarchical File System − Linux provides a standard file structure in which system files/ user files are arranged.

  • Shell − Linux provides a special interpreter program which can be used to execute commands of the operating system. It can be used to do various types of operations, call application programs, etc.

  • Security − Linux provides user security using authentication features like password protection/ controlled access to specific files/ encryption of data.

Architecture

The following illustration shows the architecture of a Linux system −

Linux Operating System Architecture

The architecture of a Linux System consists of the following layers −

  • Hardware layer − Hardware consists of all peripheral devices (RAM, HDD, CPU, etc.).

  • Kernel − The core component of the operating system; it interacts directly with the hardware and provides low-level services to the upper-layer components.

  • Shell − An interface to the kernel that hides the complexity of the kernel's functions from users. The shell takes commands from the user and executes the kernel's functions (a toy command loop is sketched after this list).

  • Utilities − Utility programs that provide the user with most of the functionalities of an operating system.
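To make the shell layer concrete, here is a deliberately tiny command loop in Python: it reads a command from the user and hands it to the operating system to execute. This is only a sketch of the idea; a real shell such as bash adds parsing, pipes, variables, job control and much more.

    import subprocess

    # A toy "shell": read a command, hand it to the OS, show the result, repeat.
    while True:
        line = input("mini-shell> ").strip()
        if line in ("", "exit", "quit"):
            break
        subprocess.run(line, shell=True)   # delegates the real work to the system shell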

  

Introduction To Online Basic Electronics For High School and University Level

 

SiSTEC ELECTRONICS PRACTICALS  


Learning about basic electronics and creating your own projects is a lot easier than you may think.  In this tutorial, we’re going to give you a brief overview of common electronic components and explain what their functions are.  You will then learn about schematic diagrams and how they are used to design and build circuits.  And finally, you will put this information to use by creating your first basic circuit.


Electronic Workbench

Before you get started, make sure your electronic workbench is properly set up.  The work area doesn’t need to be fancy and you could even build your own electronic workbench.


Storage

Electronic components can be small and it’s a good idea to keep everything organized.  The most popular option is to use clear plastic storage boxes for storing parts.  In addition,  you could use plastic storage bins that hang from a rack or fit on a shelf.


Tools

Now that you have a good workspace set up, it’s time to stock it with the proper tools and equipment.  This isn’t a complete list but it does highlight the most common items used in electronics.

Breadboard

Breadboards are an essential tool for prototyping and building temporary circuits.  These boards contain holes for inserting wire and components.  Because of their temporary nature, they allow you to create circuits without soldering.  The holes in a breadboard are connected in rows both horizontally and vertically as shown below.

Diagram: breadboard connection directions

Digital Multimeter

A multimeter is a device that's used to measure electric current (amps), voltage (volts) and resistance (ohms).  It's great for troubleshooting circuits and is capable of measuring both AC and DC voltage.


Battery Holders

A battery holder is a plastic case that holds batteries such as 9V or AA cells.  Some holders are enclosed and may have an on/off switch built in.


Test Leads (Alligator Clips)

Test leads are great for connecting components together to test a circuit without the need for soldering.


Wire Cutter

Wire cutters are essential for cutting and stripping stranded and solid copper wire.


Precision Screwdriver Set

Precision screwdrivers are also known as jeweler’s screwdrivers and usually come as a set.  The advantage of these over normal screwdrivers is the precision tips of each driver.  These are very handy when working with electronics that contain tiny screws.


Helping 3rd Hand

When working with electronics, it seems you never have enough hands to hold everything.  This is where the helping hand (3rd hand) comes in.  Great for holding circuit boards or wire when soldering or tinning.


Heat Gun

A heat gun is used to shrink plastic tubing known as heat shrink to help protect exposed wire.  Heat shrink has been called the duct tape of electronics and comes in handy in a wide variety of applications.


Jumper Wire

These wires are used with breadboards and development boards and are generally 22-28 AWG solid core wire.  Jumper wires can have male or female ends depending on how they need to be used.


Soldering Iron

When it's time to create a permanent circuit, you'll want to solder the parts together.  To do this, a soldering iron is the tool you would use.  Of course, a soldering iron isn't any good unless you have solder to go with it.  You can choose leaded or lead-free solder in a few diameters.


Electronic Components

Now it's time to talk about the different components that make your electronic projects come to life.  Below is a quick breakdown of the most common components and the functions they perform.

Switch 

Switches can come in many forms such as pushbutton, rocker, momentary and others.  Their basic function is to interrupt electric current by turning a circuit on or off.


Resistor

Resistors are used to resist the flow of current or to control the voltage in a circuit.  The amount of resistance that a resistor offers is measured in ohms.  Most resistors have colored stripes on the outside, and this color code tells you the resistor's value.  You can use a multimeter or Digikey's resistor color code calculator to determine the value of a resistor (a small color-code sketch follows).
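As a rough illustration of how the color code works for common 4-band resistors (ignoring the tolerance band), the Python sketch below turns the first two stripe colors into digits and the third into a power-of-ten multiplier. The function name and example bands are only for this illustration.

    COLOURS = ["black", "brown", "red", "orange", "yellow",
               "green", "blue", "violet", "grey", "white"]

    def resistor_ohms(band1, band2, multiplier):
        """First two bands give the digits, the third the power-of-ten multiplier."""
        digits = COLOURS.index(band1) * 10 + COLOURS.index(band2)
        return digits * 10 ** COLOURS.index(multiplier)

    print(resistor_ohms("brown", "black", "red"))      # 1, 0, x100  -> 1000 ohms (1 kΩ)
    print(resistor_ohms("yellow", "violet", "orange")) # 4, 7, x1000 -> 47000 ohms (47 kΩ)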


Variable Resistor (Potentiometer)

A variable resistor is also known as a potentiometer.  These components can be found in devices such as a light dimmer or the volume control on a radio.  When you turn the shaft of a potentiometer, the resistance in the circuit changes.


Light-Dependent Resistor (LDR)

A light-dependent resistor is also a variable resistor, but it is controlled by light rather than by turning a knob.  The resistance in the circuit changes with the intensity of the light.  These are often found in exterior lights that automatically turn on at dusk and off at dawn.


Capacitor

Capacitors store electricity and then discharge it back into the circuit when there is a drop in voltage.  A capacitor is like a rechargeable battery in that it can be charged and then discharged.  The value is measured in farads (F), typically in the nanofarad (nF) or picofarad (pF) range.


Diode

A diode allows electricity to flow in one direction and blocks it from flowing the opposite way.  The diode's primary role is to keep electricity from taking an unwanted path within the circuit.


Light-Emitting Diode (LED)

A light-emitting diode is like a standard diode in that electrical current only flows in one direction.  The main difference is that an LED will emit light when electricity flows through it.  Inside an LED there is an anode and a cathode.  Current always flows from the anode (+) to the cathode (-) and never in the opposite direction (a short series-resistor calculation follows).
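Because an LED conducts so readily in its forward direction, it is normally paired with a series resistor, and Ohm's law tells you how large that resistor should be. The values below (9V supply, 2V forward voltage, 20mA target current) are example assumptions, not universal figures.

    # Ohm's law: R = (supply voltage - LED forward voltage) / desired current
    supply_v = 9.0        # assumed 9V battery
    led_forward_v = 2.0   # assumed forward voltage of a typical red LED
    led_current = 0.020   # assumed 20 mA target current

    resistance = (supply_v - led_forward_v) / led_current
    print(round(resistance), "ohms")  # 350 ohms, so a standard 360 or 390 ohm part works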