Tuesday, 16 July 2013

AT&T and Bell Laboratories

AT&T has averaged over two global patents issued per business day since the inception of AT&T Labs. The goal is to continue to create value for AT&T’s customers and the company through unmatched innovation.

Monday, 15 July 2013

Query Processor of Google


A multi-stage query processing system and method enables multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. At one or more stages of a multi-stage query processing system, a set of relevancy scores is used to select a subset of documents for presentation as an ordered list to a user. The set of relevancy scores can be derived in part from one or more sets of relevancy scores determined in prior stages of the multi-stage query processing system. In some embodiments, the multi-stage query processing system is capable of executing one or more passes on a user query, and using information from each pass to expand the user query for use in a subsequent pass to improve the relevancy of documents in the ordered list.

Query Processing Phases

There are two major phases in query processing: query optimization and query execution.
Query optimization is the process of choosing the fastest execution plan. In the optimization phase, the query processor chooses:
  • Which, if any, indexes to use. 
  • The order in which joins are executed. 
  • The order in which constraints such as WHERE clauses are applied. 
  • Which algorithms are likely to lead to the best performance, based on cost information derived from statistics. 
Query execution is the process of executing the plan chosen during query optimization. The query execution component also determines the techniques available to the query optimizer. For example, SQL Server implements a hash join algorithm and a merge join algorithm, both of which are available to the query optimizer.
The query optimizer is the brain of a relational database system, enabling it to work intelligently and efficiently.
A relational database with a sophisticated query optimizer is more likely to complete a query, especially a complex query, faster than a relational database with a simple query optimizer.

Types of Query Optimizers


There are two major types of query optimizers in relational databases: syntax-based and cost-based.

Syntax-based Query Optimizers

A syntax-based query optimizer creates a procedural plan for obtaining the answer to an SQL query, but the particular plan it chooses is dependent on the exact syntax of the query and the order of the clauses within the query. A syntax-based query optimizer executes the same plan every time, regardless of whether the number or composition of records in the database changes over time. Unlike a cost-based query optimizer, it neither maintains nor considers statistics about the database.

Cost-based Query Optimizers


A cost-based query optimizer chooses among alternative plans to answer an SQL query. Selection is based on cost estimates for different plans. The factors in making cost estimates include the number of I/O operations, the amount of CPU time, and so on. A cost-based query optimizer estimates these costs by keeping statistics about the number and composition of records in a table or index and is not dependent on the exact syntax of the query or the order of clauses within the query (unlike a syntax-based query optimizer).
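To make the idea concrete, here is a minimal sketch of a cost-based choice between two plans for a single equality filter. The cost constants, the `TableStats` fields and the plan names are all assumptions made up for illustration; a real optimizer such as SQL Server's uses far richer statistics and cost models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableStats:
    """Statistics the optimizer keeps about a table (hypothetical fields)."""
    row_count: int        # number of rows in the table
    distinct_values: int  # distinct values in the filtered column


# Hypothetical cost weights, chosen only for illustration.
SEQUENTIAL_IO_COST_PER_ROW = 1.0  # reading rows in order during a full scan
RANDOM_IO_COST_PER_ROW = 4.0      # fetching one row found through the index
INDEX_PROBE_COST = 10.0           # fixed cost of descending the index


def full_scan_cost(stats: TableStats) -> float:
    # A full scan reads every row once.
    return stats.row_count * SEQUENTIAL_IO_COST_PER_ROW


def index_lookup_cost(stats: TableStats) -> float:
    # Estimate matching rows by assuming values are evenly distributed.
    matching_rows = stats.row_count / stats.distinct_values
    return INDEX_PROBE_COST + matching_rows * RANDOM_IO_COST_PER_ROW


def choose_plan(stats: TableStats) -> str:
    """Return the name of the plan with the lowest estimated cost."""
    costs = {
        "full_scan": full_scan_cost(stats),
        "index_lookup": index_lookup_cost(stats),
    }
    return min(costs, key=costs.get)
```

The same query text can therefore get different plans as the statistics change: a selective filter on a large table favours the index, while a filter that matches half the table favours a scan. A syntax-based optimizer would pick the same plan in both cases.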

How does a web crawler work?



The first thing you need to understand is what a Web Crawler or Spider is and how it works. A Search Engine Spider (also known as a crawler, Robot, SearchBot or simply a Bot) is a program that most search engines use to find what’s new on the Internet. Google’s web crawler is known as GoogleBot. There are many types of web spiders in use, but for now, we’re only interested in the Bot that actually “crawls” the web and collects documents to build a searchable index for the different search engines.

The program starts at a website and follows every hyperlink on each page. In principle, every page that is linked from somewhere will eventually be found and spidered, as the so-called “spider” crawls from one website to another. Search engines may run thousands of instances of their web crawling programs simultaneously, on multiple servers. When a web crawler visits one of your pages, it loads the page’s content into a database. Once a page has been fetched, its text is loaded into the search engine’s index, which is a massive database of words and where they occur on different web pages. All of this may sound too technical for most people, but it’s important to understand the basics of how a Web Crawler works.


So there are basically three steps involved in the web crawling procedure. First, the search bot starts by crawling the pages of your site. Then it indexes the words and content of the site, and finally it visits the links (web page addresses or URLs) that are found on your site. When the spider can no longer find a page, that page will eventually be deleted from the index. However, some spiders check a second time to verify that the page really is offline.
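The crawl, index and follow-links loop described above can be sketched as a breadth-first traversal. This is only an illustration: the `FAKE_WEB` dictionary stands in for real HTTP requests, the URLs are made up, and real crawlers add politeness delays, robots.txt checks and much more.

```python
from collections import deque

# Hypothetical in-memory "web": each URL maps to the links found on that page.
FAKE_WEB = {
    "http://example.com/": ["http://example.com/a", "http://example.com/b"],
    "http://example.com/a": ["http://example.com/b", "http://example.com/missing"],
    "http://example.com/b": ["http://example.com/"],
}


def crawl(seed, fetch_links, max_pages=100):
    """Visit pages breadth-first, starting at `seed`.

    `fetch_links(url)` returns the list of links on the page,
    or None when the page cannot be found.
    """
    frontier = deque([seed])  # URLs waiting to be fetched
    seen = {seed}             # URLs already queued, so nothing is fetched twice
    fetched = []              # pages successfully fetched, in visit order

    while frontier and len(fetched) < max_pages:
        url = frontier.popleft()
        links = fetch_links(url)
        if links is None:
            # Page not found: skip it (a real index would eventually drop it).
            continue
        fetched.append(url)   # here a real crawler would hand the text to the indexer
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return fetched


crawled = crawl("http://example.com/", FAKE_WEB.get)
```

The `seen` set is what keeps the spider from looping forever when pages link back to each other.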
The first thing a spider is supposed to do when it visits your website is look for a file called “robots.txt”. This file contains instructions for the spider on which parts of the website to index, and which parts to ignore. The standard way to control what a spider sees on your site is by using a robots.txt file. All spiders are supposed to follow these rules, and the major search engines do follow them for the most part. Fortunately, major search engines like Google and Bing are finally working together on standards.
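As a small illustration of how a well-behaved spider could honour these instructions, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the bot name `ExampleBot` and the URLs are invented for the example, and the rules are parsed from a string rather than downloaded.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every bot may crawl everything except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())  # parse rules from text instead of fetching them

# "ExampleBot" is a made-up user agent name.
public_allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
private_allowed = parser.can_fetch("ExampleBot", "http://example.com/private/report.html")
```

A crawler would call `can_fetch` before requesting each URL and skip any page for which it returns False.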

How Google Works



Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:
  • Googlebot, a web crawler that finds and fetches web pages.
  • The indexer, which sorts every word on every page and stores the resulting index of words in a huge database.
  • The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.

Let’s take a closer look at each part.

1. Googlebot, Google’s Web Crawler

Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google’s indexer.
Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it’s capable of doing.

2. Google’s Indexer

Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.
To improve search performance, Google ignores (doesn’t index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. The indexer also ignores some punctuation and multiple spaces, as well as converting all letters to lowercase, to improve Google’s performance.
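The index structure described in this section can be sketched in a few lines. This is a toy version under stated assumptions: the three documents are invented, the stop-word list contains only the words named above, and a Python dictionary is used in place of an alphabetically sorted on-disk index.

```python
import re
from collections import defaultdict

# Only the stop words named in the text; a real list would be longer.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def tokenize(text):
    """Lowercase the text and split it into words, dropping punctuation and extra spaces."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in docs.items():
        for position, word in enumerate(tokenize(text)):
            if word in STOP_WORDS:
                continue  # stop words are not indexed
            index[word][doc_id].append(position)
    return index


def search(index, query):
    """Return the ids of documents containing every non-stop-word query term."""
    terms = [word for word in tokenize(query) if word not in STOP_WORDS]
    if not terms:
        return set()
    result = set(index.get(terms[0], {}))
    for term in terms[1:]:
        result &= set(index.get(term, {}))
    return result


# Hypothetical documents for illustration.
docs = {
    1: "The history of Hollywood",
    2: "How the web crawler works",
    3: "Hollywood and the web",
}
index = build_index(docs)
```

Looking up a term is a single dictionary access, which is why this kind of structure gives rapid access to the documents that contain the query terms.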

3. Google’s Query Processor

The query processor has several parts, including the user interface (search box), the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.
PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.
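For intuition only, here is a sketch of the originally published PageRank idea: a page is important if important pages link to it. It is not Google's production ranking, which is not public. The damping factor of 0.85 follows the original publication; the four-page link graph is made up, and the function assumes every link target also appears as a key in `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Compute PageRank by repeated redistribution of rank along links.

    `links` maps each page to the list of pages it links to.
    Assumption: every link target is itself a key of `links`.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start with equal rank everywhere

    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its rank evenly among the pages it links to.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: a, b and d all link to c; only c links back to a.
example_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(example_links)
```

In this graph, page c collects links from three pages and ends up with the highest rank, while page d, which nobody links to, keeps only the baseline share.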
Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. A patent application discusses other factors that Google considers when ranking a page. Visit SEOmoz.org’s report for an interpretation of the concepts and the practical applications contained in Google’s patent application.
Google also applies machine-learning techniques to improve its performance automatically by learning relationships and associations within the stored data. For example, the spelling-correcting system uses such techniques to figure out likely alternative spellings. Google closely guards the formulas it uses to calculate relevance; they’re tweaked to improve quality and performance, and to outwit the latest devious techniques used by spammers.
Indexing the full text of the web allows Google to go beyond simply matching single search terms. Google gives more priority to pages that have search terms near each other and in the same order as the query. Google can also match multi-word phrases and sentences. Since Google indexes HTML code in addition to the text on the page, users can restrict searches on the basis of where query words appear, e.g., in the title, in the URL, in the body, and in links to the page. These options are offered by Google’s Advanced Search form and by search operators (advanced operators).
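Phrase matching and proximity both depend on knowing where each word occurs in a page. The sketch below shows one simple way to do this with word positions; the function names and the sample sentence are illustrative assumptions, not Google's actual method.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def word_positions(tokens):
    """Map each word to the list of positions at which it occurs."""
    positions = {}
    for i, word in enumerate(tokens):
        positions.setdefault(word, []).append(i)
    return positions


def contains_phrase(text, phrase):
    """True if the words of `phrase` occur consecutively, in order, in `text`."""
    words = tokenize(phrase)
    if not words:
        return False
    positions = word_positions(tokenize(text))
    for start in positions.get(words[0], []):
        # The k-th phrase word must sit exactly k positions after the first.
        if all(start + k in positions.get(word, []) for k, word in enumerate(words)):
            return True
    return False


def min_distance(text, first, second):
    """Smallest gap, in words, between any occurrence of the two terms (None if absent)."""
    positions = word_positions(tokenize(text))
    a = positions.get(first.lower(), [])
    b = positions.get(second.lower(), [])
    if not a or not b:
        return None
    return min(abs(i - j) for i in a for j in b)


sample = "Google indexes the full text of the web"
```

A ranking function could then favour pages where `contains_phrase` is true or where `min_distance` between query terms is small.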



Friday, 5 July 2013

Al Jazeera: Fighting the Fifth Dimension, an overview



Al Jazeera is a broadcasting company based in Qatar, but its packaging is mostly western. In this write-up I discuss a documentary on cyber warfare produced by Al Jazeera.

Using just a computer, countries can now penetrate and control other countries with ease, even without manpower. Governments of some countries use social media such as Facebook to recruit people, even as spies, and no one can really control this. It is widely understood that information released through social media can hardly be filtered. Julian Assange of WikiLeaks is of the opinion that "Facebook was the most appalling spying machine ever invented."
Hacking kits are easily available on the internet. A group called Anonymous launched hacking attacks on MasterCard, Amazon and others, as well as in countries like Tunisia, Algeria and Libya, using LOIC, a hacking kit that is easily available online. Countries increasingly understand the importance of the cyber world. Obama has described cyberspace as a strategic national asset. Iran has developed a Cyber Police.
In Iran in 2009, social media like Twitter and Facebook were used to mobilize people for protests against the presidential election (an example of cyber activism).
Israel is technologically advanced; it has almost 8000 PhD holders in the field of GSM alone. This is astonishing considering the small size of the country. In 2010, the Lebanese found Israeli spy devices in their territory. It was as if Israel had extended its network into this neighboring country.
There are many related examples. One is Echelon, which was originally developed by the US during the Cold War. It can reportedly be used to turn any phone into a microphone, so that anything that happens around that phone can be recorded and monitored even if the phone is switched off. Another example of cyber warfare is the monitoring and killing of Osama Bin Laden by the US.

Technical Evolution in Hollywood

The name Hollywood:


By the 1870s an agricultural community flourished in the area, and crops ranging from hay and grain to subtropical bananas and pineapples were thriving. During the 1880s, the Ranchos were subdivided. In 1886, H. H. Wilcox bought an area of Rancho La Brea that his wife then christened "Hollywood." Within a few years, Wilcox had devised a grid plan for his new community, paved Prospect Avenue (now Hollywood Boulevard) for his main street and was selling large residential lots to wealthy Midwesterners looking to build homes so they could "winter in California."


Technological History:


The first phase of motion pictures, in the late 1890s and into the 1900s, emphasized reproducing human motion. The second phase, telling a story, began to emerge around 1900. Film makers moved beyond the technical aspects of just showing motion and began to tell stories. Edwin Porter’s 1903 film, "The Great Train Robbery," is a good example of the storytelling nature of films. It is the story of a robbery, with a chase scene and the inevitable capture of the robbers.

These early films were quite short, running 5 to 8 minutes long; they were called "one reelers" (they were just one reel of film). In the U.S., these films were produced by a handful of small companies just outside of New York City (Biograph, Essanay, Lubin, Pathe Brothers, Selig Polyscope, Vitagraph, Edison and Melies).

One of the more dynamic early directors was David Wark Griffith. He worked for Biograph in New Jersey and produced literally hundreds of one-reelers in the period from 1908 to 1912. A director like Griffith might be expected to produce at least two one-reel movies a week. The names of the actors were not released, for fear they would become stars and want higher salaries.

One early Griffith film was "The Lonedale Operator," in 1911. It starred Blanche Sweet, whose character outsmarted the desperados. The film demonstrates some of Griffith’s innovative techniques, including cross cutting (cutting from one scene to another scene, and then back and forth, to develop various parts of a story and to build suspense) and closeups. Some early movie company owners objected to closeups, arguing that paying movie viewers would want to see the ENTIRE person. Closeups, however, could bring drama.

Griffith and others in the industry wanted to move beyond the simple formula that characterized the industry in the early 1900s. But industry owners were resistant, wanting to keep to one-reelers and limited storytelling. These owners monopolized the industry through patents on key machinery and cameras and through control over distribution.

Consequently, the dissidents left the East completely and moved about as far away as they could get -- to Los Angeles. Well, to a rural area near Los Angeles -- where the weather was good (lots of sunshine, little rain, so ideal for outside movie work) and barns were plentiful (on farms) for inside work. This was Hollywood. There, Griffith and others began to experiment with longer films, and Griffith produced the first successful full-length feature film.

Computer Generated Imagery

CGI is used in films, television programs and commercials, and in printed media. Video games most often use real-time computer graphics (rarely referred to as CGI), but may also include pre-rendered "cut scenes" and intro movies that would be typical CGI applications. CGI is used for visual effects because the quality is often higher and effects are more controllable than other more physically based processes, such as constructing miniatures for effects shots or hiring extras for crowd scenes, and because it allows the creation of images that would not be feasible using any other technology. It can also allow a single artist to produce content without the use of actors, expensive set pieces, or props. Recent accessibility of CGI software and increased computer speeds have allowed individual artists and small companies to produce professional grade films, games, and fine art from their home computers.
Examples: Avatar, Tron.


Thursday, 4 July 2013

The Ongoing cyber war


The world of cyber crime is as vast as cyberspace itself. It is hard to control, and details about it are hard to obtain. The CBS News documentary on cyber warfare covers cyber crime across many strata of society. It tells the story of Mafiaboy, who was arrested for bringing down major websites such as yahoo.com, cnn.com and amazon.com. Young people, especially school students, treat hacking as a game and show great interest in learning hacking software. Hacking toolkits are also available for free on the internet and can be found through many search engines. The documentary features the views and responses of various professional hackers and internet security officials on cyber warfare, and it also presents the perspective of well-known IT firms such as Microsoft.
Cyber warfare also takes place between countries. Countries intrude into other countries’ security systems using various virus programs and steal important information, which puts the targeted country under security threat. The documentary presents Russia as the foremost country in this category. But China is now among the most prominent hackers. The cyber war between India and China is the best example of this.
Even though many questions have been raised about internet security, cyber warfare continues, still threatening the security of many countries and individuals.