Tuesday, 16 July 2013
Monday, 15 July 2013
Query Processor of Google
A multi-stage query processing system and method enables multi-stage query scoring, including “snippet” generation, through incremental document reconstruction facilitated by a multi-tiered mapping scheme. At one or more stages of a multi-stage query processing system, a set of relevancy scores is used to select a subset of documents for presentation as an ordered list to a user. The set of relevancy scores can be derived in part from one or more sets of relevancy scores determined in prior stages of the multi-stage query processing system. In some embodiments, the multi-stage query processing system is capable of executing one or more passes on a user query, and using information from each pass to expand the user query for use in a subsequent pass to improve the relevancy of documents in the ordered list.
Query Processing Phases
There are two major phases in query processing: query optimization and query execution.
Query optimization is the process of choosing the fastest execution plan. In the optimization phase, the query processor chooses:
- Which, if any, indexes to use.
- The order in which joins are executed.
- The order in which constraints such as WHERE clauses are applied.
- Which algorithms are likely to lead to the best performance, based on cost information derived from statistics.
Query execution is the process of executing the plan chosen during query optimization. The query execution component also determines the techniques available to the query optimizer. For example, SQL Server implements a hash join algorithm and a merge join algorithm, both of which are available to the query optimizer.
The query optimizer is the brain of a relational database system, enabling it to work intelligently and efficiently.
A relational database with a sophisticated query optimizer is more likely to complete a query, especially a complex query, faster than a relational database with a simple query optimizer.
Types of Query Optimizers
There are two major types of query optimizers in relational databases: syntax-based and cost-based.
Syntax-based Query Optimizers
A syntax-based query optimizer creates a procedural plan for obtaining the answer to an SQL query, but the particular plan it chooses is dependent on the exact syntax of the query and the order of the clauses within the query. A syntax-based query optimizer executes the same plan every time, regardless of whether the number or composition of records in the database changes over time. Unlike a cost-based query optimizer, it neither maintains nor considers statistics about the database.
Cost-based Query Optimizers
A cost-based query optimizer chooses among alternative plans to answer an SQL query. Selection is based on cost estimates for different plans. The factors in making cost estimates include the number of I/O operations, the amount of CPU time, and so on. A cost-based query optimizer estimates these costs by keeping statistics about the number and composition of records in a table or index and is not dependent on the exact syntax of the query or the order of clauses within the query (unlike a syntax-based query optimizer).
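The idea of choosing a plan from statistics can be illustrated with a minimal sketch. It is not how SQL Server or any real optimizer computes costs: the `TableStats` fields, the cost formulas and the index depth of 3 are all assumptions made up for the illustration. It only shows that the same query can get a different plan when the statistics change.

```python
from dataclasses import dataclass


@dataclass
class TableStats:
    """Hypothetical statistics an optimizer might keep for one table."""
    row_count: int       # total rows in the table
    rows_per_page: int   # rows that fit on one disk page
    matching_rows: int   # estimated rows that satisfy the WHERE clause
    has_index: bool      # whether an index exists on the filtered column


def full_scan_cost(stats):
    # Assumed cost model: read every page of the table once.
    return -(-stats.row_count // stats.rows_per_page)  # ceiling division


def index_scan_cost(stats, index_depth=3):
    # Assumed cost model: walk the index, then one page read per matching row.
    return index_depth + stats.matching_rows


def choose_plan(stats):
    """Return (plan_name, estimated_cost) for the cheapest candidate plan."""
    candidates = {"full_scan": full_scan_cost(stats)}
    if stats.has_index:
        candidates["index_scan"] = index_scan_cost(stats)
    best = min(candidates, key=candidates.get)
    return best, candidates[best]


# Same query, different statistics, different plan.
selective = TableStats(row_count=1_000_000, rows_per_page=100,
                       matching_rows=100, has_index=True)
unselective = TableStats(row_count=1_000_000, rows_per_page=100,
                         matching_rows=500_000, has_index=True)
print(choose_plan(selective))
print(choose_plan(unselective))
```

A syntax-based optimizer, by contrast, would return the same plan for both cases because it never looks at `matching_rows`.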
for more info visit: http://technet.microsoft.com/en-us/library/cc966472.aspx
How does a web crawler work?
The first thing you need to understand is what a web crawler or spider is and how it works. A search engine spider (also known as a crawler, robot, SearchBot or simply a bot) is a program that most search engines use to find what’s new on the Internet. Google’s web crawler is known as Googlebot. There are many types of web spiders in use, but for now we’re only interested in the bot that actually “crawls” the web and collects documents to build a searchable index for the different search engines. The program starts at a website and follows every hyperlink on each page. So we can say that any page reachable through links will eventually be found and spidered, as the so-called “spider” crawls from one website to another. Search engines may run thousands of instances of their web crawling programs simultaneously, on multiple servers. When a web crawler visits one of your pages, it loads the page’s content into a database. Once a page has been fetched, its text is loaded into the search engine’s index, which is a massive database of words and where they occur on different web pages. All of this may sound too technical for most people, but it’s important to understand the basics of how a web crawler works.
So there are basically three steps involved in the web crawling procedure. First, the search bot starts by crawling the pages of your site. Then it indexes the words and content of the site, and finally it visits the links (web page addresses or URLs) that are found on your site. When the spider can’t find a page, that page will eventually be deleted from the index. However, some spiders will check a second time to verify that the page really is offline.
The first thing a spider is supposed to do when it visits your website is look for a file called “robots.txt”. This file contains instructions for the spider on which parts of the website to index, and which parts to ignore. A robots.txt file is the main way to control what a spider sees on your site. All spiders are supposed to follow some rules, and the major search engines do follow these rules for the most part. Fortunately, the major search engines like Google or Bing are finally working together on standards.
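The three steps above (fetch a page, store its text, follow its links) can be sketched as a breadth-first loop. This is only an illustration, not how Googlebot or any real crawler is built: `FAKE_WEB` is a made-up in-memory stand-in for the web, and the `fetch` callback is a hypothetical function that returns `(text, links)` for a URL or `None` when the page cannot be found.

```python
from collections import deque

# Hypothetical miniature "web": URL -> (page text, links found on the page).
FAKE_WEB = {
    "a": ("home page", ["b", "c"]),
    "b": ("about page", ["a"]),
    "c": ("contact page", ["d"]),  # "d" does not exist (a dead link)
}


def crawl(start_url, fetch, max_pages=100):
    """Breadth-first crawl; returns {url: text} for every page that was fetched."""
    seen = {start_url}           # URLs already queued, so nothing is fetched twice
    queue = deque([start_url])
    index = {}
    while queue and len(index) < max_pages:
        url = queue.popleft()
        page = fetch(url)        # step 1: crawl the page
        if page is None:         # page not found: it never enters the index
            continue
        text, links = page
        index[url] = text        # step 2: store the content for indexing
        for link in links:       # step 3: visit the links found on the page
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return index


pages = crawl("a", FAKE_WEB.get)
print(list(pages))
```

A real crawler would replace `FAKE_WEB.get` with an HTTP request plus an HTML link extractor, and would run many such loops in parallel.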
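As a small illustration of how a well-behaved spider might honour these instructions, the sketch below uses Python's standard `urllib.robotparser` module. The robots.txt content, the bot name `ExampleBot` and the `example.com` URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: every bot may crawl everything except /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
allowed = parser.can_fetch("ExampleBot", "http://example.com/index.html")
blocked = parser.can_fetch("ExampleBot", "http://example.com/private/data.html")
print(allowed, blocked)
```

Note that robots.txt is advisory: it only works because crawlers choose to check it before fetching a page.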
How Google Works
Google runs on a distributed network of thousands of low-cost computers and can therefore carry out fast parallel processing. Parallel processing is a method of computation in which many calculations can be performed simultaneously, significantly speeding up data processing. Google has three distinct parts:
- Googlebot, a web crawler that finds and fetches web pages.
- The indexer that sorts every word on every page and stores the resulting index of words in a huge database.
- The query processor, which compares your search query to the index and recommends the documents that it considers most relevant.
Let’s take a closer look at each part.
1. Googlebot, Google’s Web Crawler
Googlebot is Google’s web crawling robot, which finds and retrieves pages on the web and hands them off to the Google indexer. It’s easy to imagine Googlebot as a little spider scurrying across the strands of cyberspace, but in reality Googlebot doesn’t traverse the web at all. It functions much like your web browser, by sending a request to a web server for a web page, downloading the entire page, then handing it off to Google’s indexer.
Googlebot consists of many computers requesting and fetching pages much more quickly than you can with your web browser. In fact, Googlebot can request thousands of different pages simultaneously. To avoid overwhelming web servers, or crowding out requests from human users, Googlebot deliberately makes requests of each individual web server more slowly than it’s capable of doing.
2. Google’s Indexer
Googlebot gives the indexer the full text of the pages it finds. These pages are stored in Google’s index database. This index is sorted alphabetically by search term, with each index entry storing a list of documents in which the term appears and the location within the text where it occurs. This data structure allows rapid access to documents that contain user query terms.
To improve search performance, Google ignores (doesn’t index) common words called stop words (such as the, is, on, or, of, how, why, as well as certain single digits and single letters). Stop words are so common that they do little to narrow a search, and therefore they can safely be discarded. The indexer also ignores some punctuation and multiple spaces, as well as converting all letters to lowercase, to improve Google’s performance.
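The index structure described above (each term mapped to the documents and positions where it occurs, with stop words dropped and text lowercased) is commonly called an inverted index. Here is a minimal sketch of one; the stop-word list, the tokenizer and the sample documents are assumptions for the illustration, not Google's actual implementation.

```python
import re
from collections import defaultdict

# A small, assumed stop-word list based on the examples in the text.
STOP_WORDS = {"the", "is", "on", "or", "of", "how", "why"}


def tokenize(text):
    """Lowercase the text and split it into words, ignoring punctuation and spaces."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in documents.items():
        for position, word in enumerate(tokenize(text)):
            if word not in STOP_WORDS:  # stop words are not indexed
                index[word][doc_id].append(position)
    return index


def search(index, query):
    """Return the ids of documents that contain every non-stop-word query term."""
    terms = [t for t in tokenize(query) if t not in STOP_WORDS]
    if not terms:
        return set()
    result = set(index.get(terms[0], {}))
    for term in terms[1:]:
        result &= set(index.get(term, {}))
    return result


docs = {
    1: "The spider crawls the web",
    2: "A web index is sorted",
    3: "Spider on the wall",
}
index = build_index(docs)
print(sorted(index))              # the terms, in alphabetical order
print(search(index, "the spider"))
```

Looking a term up costs one dictionary access, no matter how many documents were indexed, which is why this structure gives rapid access to matching documents.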
3. Google’s Query Processor
The query processor has several parts, including the user interface (search box), the “engine” that evaluates queries and matches them to relevant documents, and the results formatter.
PageRank is Google’s system for ranking web pages. A page with a higher PageRank is deemed more important and is more likely to be listed above a page with a lower PageRank.
Google considers over a hundred factors in computing a PageRank and determining which documents are most relevant to a query, including the popularity of the page, the position and size of the search terms within the page, and the proximity of the search terms to one another on the page. A patent application discusses other factors that Google considers when ranking a page. Visit SEOmoz.org’s report for an interpretation of the concepts and the practical applications contained in Google’s patent application.
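To give a feel for the link-popularity part of this, here is a sketch of the textbook PageRank iteration, in which a page shares its rank equally among the pages it links to. This is the publicly described simplified formulation, not Google's production ranking, and it ignores all of the other factors mentioned above. The three-page link graph, the damping factor of 0.85 and the iteration count are assumptions for the example.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank.

    `links` maps every page to the list of pages it links to; every link
    target must also appear as a key.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share regardless of links.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            # A page with no outgoing links spreads its rank over all pages.
            targets = outlinks if outlinks else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical link graph: "c" is linked to by both "a" and "b".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
ranks = pagerank(graph)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 3))
```

In this graph "c" ends up ranked highest because two pages link to it, and "a" comes second because it receives all of the rank that "c" passes on.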
Google also applies machine-learning techniques to improve its performance automatically by learning relationships and associations within the stored data. For example, the spelling-correcting system uses such techniques to figure out likely alternative spellings. Google closely guards the formulas it uses to calculate relevance; they’re tweaked to improve quality and performance, and to outwit the latest devious techniques used by spammers.
Indexing the full text of the web allows Google to go beyond simply matching single search terms. Google gives more priority to pages that have search terms near each other and in the same order as the query. Google can also match multi-word phrases and sentences. Since Google indexes HTML code in addition to the text on the page, users can restrict searches on the basis of where query words appear, e.g., in the title, in the URL, in the body, and in links to the page, options offered by Google’s Advanced Search Form and Using Search Operators (Advanced Operators).
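The preference for search terms that appear near each other can be illustrated with a toy proximity score. The scoring rule used here (one divided by the smallest gap between the two terms) is an assumption chosen for simplicity, not a formula Google has published.

```python
def positions(words, term):
    """Return every position at which `term` occurs in the word list."""
    return [i for i, word in enumerate(words) if word == term]


def proximity_score(text, term_a, term_b):
    """Score a text higher the closer together the two terms appear.

    Returns 0.0 if either term is missing, otherwise 1 / (smallest gap in words).
    """
    words = text.lower().split()
    pos_a = positions(words, term_a)
    pos_b = positions(words, term_b)
    if not pos_a or not pos_b:
        return 0.0
    smallest_gap = min(abs(a - b) for a in pos_a for b in pos_b)
    return 1.0 / smallest_gap if smallest_gap else 1.0


near = proximity_score("cheap flights to paris", "cheap", "flights")
far = proximity_score("cheap hotels and many other flights", "cheap", "flights")
print(near, far)
```

With a positional index (one that stores where each term occurs), a search engine can compute this kind of score without rereading the page text.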
for more info: http://computer.howstuffworks.com/internet/basics/google.htm
Friday, 5 July 2013
Al Jazeera: Fighting the fifth dimension, an overview
Al Jazeera is a broadcasting company based in Qatar, but its packaging is mostly Western. In this write-up I discuss a documentary brought out by Al Jazeera about cyber warfare.
Using just a computer, countries can now penetrate and control other countries with ease, even without manpower. Governments of some countries use social media such as Facebook to recruit people, even as spies, and no one can actually control this. It is understood that information released through social media can hardly be filtered. Julian Assange of WikiLeaks is of the opinion that "Facebook was the most appalling spying machine ever invented."
Hacking kits are easily available on the internet. A group called Anonymous launched attacks on MasterCard, Amazon and others, and in countries like Tunisia, Algeria and Libya, using LOIC (Low Orbit Ion Cannon), a tool that is easily available online. Countries increasingly understand the importance of the cyber world. Obama has described cyberspace as a strategic national asset. Iran has developed a Cyber Police.
In Iran in 2009, social media like Twitter and Facebook were used to mobilize people for protests against the presidential election (an example of cyber activism).
Israel is technologically advanced: it has almost 8000 PhD holders in the field of GSM alone. This is astonishing considering the small size of the country. In 2010, Lebanon found Israeli spy devices in its territory. It was as if Israel had extended its network into this neighboring country.
There are many related examples, such as Echelon, which was originally developed by the US during the Cold War. This can be used to turn any phone into a microphone, so that anything happening around that phone can be recorded and monitored even if the phone is switched off. Another example of cyber warfare is the monitoring and killing of Osama Bin Laden by the US.
Technical Evolution in Hollywood
The name Hollywood:
By the 1870s an agricultural community flourished in the area, and crops ranging from hay and grain to subtropical bananas and pineapples were thriving. During the 1880s, the Ranchos were subdivided. In 1886, H. H. Wilcox bought an area of Rancho La Brea that his wife then christened "Hollywood." Within a few years, Wilcox had devised a grid plan for his new community, paved Prospect Avenue (now Hollywood Boulevard) for his main street and was selling large residential lots to wealthy Midwesterners looking to build homes so they could "winter in California."
Technological History:
The first phase of motion pictures, in the late 1890s and into the 1900s, emphasized reproducing human motion. The second phase, telling a story, began to emerge around 1900. Film makers moved beyond the technical aspects of just showing motion and began to tell stories. Edwin Porter’s 1903 film, "The Great Train Robbery," is a good example of the storytelling nature of films. It is the story of a robbery, with a chase scene and the inevitable capture of the robbers.
These early films were quite short, running 5 to 8 minutes; they were called "one-reelers" (they were just one reel of film). In the U.S., these films were produced by a handful of small companies just outside of New York City (Biograph, Essanay, Lubin, Pathe Brothers, Selig Polyscope, Vitagraph, Edison and Melies).
One of the more dynamic early directors was David Wark Griffith. He worked for Biograph in New Jersey and produced literally hundreds of one-reelers in the period from 1908 to 1912. A director like Griffith might be expected to produce at least two one-reel movies a week. The names of the actors were not released, for fear they would become stars and want higher salaries.
One early Griffith film was "The Lonedale Operator," in 1911. It starred Blanche Sweet; she outsmarted the desperados. The film demonstrates some of Griffith’s innovative techniques, including cross cutting (cutting from one scene to another scene, and then back and forth, to develop various parts of a story and to build suspense) and closeups. Some early movie company owners objected to closeups, arguing that paying movie viewers would want to see the ENTIRE person. Closeups, however, could bring drama.
Griffith and others in the industry wanted to move beyond the simple formula that characterized the industry in the early 1900s. But industry owners were resistant, wanting to keep to one-reelers and limited storytelling. These owners monopolized the industry, through patents on key machinery and cameras and through control over distribution.
Consequently, the dissidents left the East completely and moved about as far away as they could get -- to Los Angeles. Well, to a rural area near Los Angeles -- where the weather was good (lots of sunshine, little rain, so ideal for outside movie work) and there were plentiful barns (on farms) for inside work. This was Hollywood. In Hollywood, Griffith and others began to experiment with longer films, and Griffith produced the first successful full-length feature film.
Computer Generated Imagery
CGI is used in films, television programs and commercials, and in printed media. Video games most often use real-time computer graphics (rarely referred to as CGI), but may also include pre-rendered "cut scenes" and intro movies that would be typical CGI applications. CGI is used for visual effects because the quality is often higher and effects are more controllable than other more physically based processes, such as constructing miniatures for effects shots or hiring extras for crowd scenes, and because it allows the creation of images that would not be feasible using any other technology. It can also allow a single artist to produce content without the use of actors, expensive set pieces, or props. Recent accessibility of CGI software and increased computer speeds have allowed individual artists and small companies to produce professional grade films, games, and fine art from their home computers.
E.g.: Avatar, Tron.
Thursday, 4 July 2013
The Ongoing cyber war
The world of cyber crime is as vast as the world of cyberspace. It is hard to control and hard to get details about. The CBS News documentary on cyber warfare talks about cyber crime in many strata of society. It covers Mafiaboy, who was arrested for bringing down giant websites like yahoo.com, cnn.com and amazon.com. The young generation, especially school students, treat hacking as a game and show great interest in learning hacking software. Hacking toolkits are also available for free on the internet and can be found through many search engines. The documentary also features the views and responses of various professional hackers and internet security officials on cyber warfare, and presents the point of view of well-known IT firms like Microsoft.
Cyber warfare also happens between countries. A country intrudes into another country’s security systems using various virus programs and steals important information, which puts the latter country under a security threat. The documentary shows Russia as the main country in this category. But now China is one of the most prominent hackers. The cyber war between India and China is the best example of this.
Even though many questions have been raised about the security of the internet, cyber warfare continues, still threatening the security of many countries and individuals.
Friday, 28 June 2013
Digital Media And Society
Presentations in Class
June 17 Monday
Pew report: digital nonprofits optimistic
The article discusses the Pew report, which informs the public about the issues, trends and attitudes shaping America and the world. It talks about nonprofit journalism, which is also called think tank journalism.
When journalism is practised as a nonprofit business, it can be termed nonprofit journalism. It is normally operated for the good of the people without concern for making a profit. The article talks about the Pew report, which looked at 172 news organisations throughout the country. It discusses the Internal Revenue Service (IRS), which is responsible for collecting taxes and for interpreting and enforcing the Internal Revenue Code. Let’s have a look at the relation between the IRS and nonprofit journalism.
Any individual or team aiming to start a nonprofit journalism practice must provide various required documents to the IRS. The IRS will determine whether the individual or team has a specific goal aimed at benefiting society.
The advantages of nonprofit journalism include exemption from paying taxes, the ability to accept donations, etc.
For more info please refer to:
http://www.cjr.org/behind_the_news/pew_report_on_digital_news_non.php
Fair Game
When it comes to ethical cyber journalism, it is important that laws be in place. The article talks about the need for fair journalism. It is quite easy to paraphrase someone’s words, or copy their paragraphs or photographs, and publish them. So the Center for Social Media has produced a new set of principles on fair use for journalism, which is the tenth such consensus document. The new principles are seven in number:
• Fair use applies to the incidental and fortuitous capture of copyright material in journalism.
• Fair use applies when journalists use copyrighted material as documentation, to validate, prove, support, or document a proposition.
• The use of textual, visual and other quotations of cultural material for purposes of reporting, criticism, commentary, or discussion constitutes fair use.
• Fair use applies to illustration in news reporting.
• Fair use applies to journalistic incorporation of historical material.
• The use of copyrighted material to promote public discussion and analysis can qualify as fair use.
• Fair use can apply to the quotation of earlier journalism.
These policies were drafted after a lot of debate about what kind of documentation, comments and photographs one can use, borrow or cite ethically when creating new content. The new content can be based on many things, including opinions and reviews of existing material on the internet. Policies like these are best implemented and registered as laws to safeguard the interests of the people who invest their time and energy to add value to the vast content database.
Social Media in smaller markets
The article on social media in smaller markets talks about social media editors. It discusses social media editing and how its features have changed with time. Let’s have a look at what exactly a social media editor does. A social media editor handles manual tweeting for the newspaper, sets up Twitter accounts for reporters who wish to tweet and teaches those who are reluctant to jump in, handles all things Facebook, and coordinates all of the blogging that the newspaper intends to do both internally and externally. The role was created in recognition of the growing importance of social networking and the huge part it plays in people’s lives.
The announcement by Rob Fishman, social media editor at the Huffington Post, that “the social media editor is dead” prompted plenty of responses, from Adweek to Zombie Journalism and many social media editors and digital media strategists in between. The article by Sara Morrison in the Columbia Journalism Review talks about the ‘social media editor’. It mentions Anthony De Rosa, Liz Heron and many other well-known people in the media field who became ‘social media editors’. Fishman’s announcement shows that the role is in flux and that the position may not exist in the near future. Most of those mentioned above disagreed with the announcement and talked about the huge need for social media editors in the field. The writer says that most local and small media firms are in real need of social media editors, and that the role offers high job security and profit. The article also includes interviews with three social media editors at small outlets: the Hartford Courant, the St. Paul Pioneer Press and The Topeka Capital Journal. The interviews show that all three are highly successful in their jobs, with more than 10000 likes on Facebook and more than 20000 followers on Twitter, a huge number compared with the circulation of these dailies and their small size. These statistics show how powerful the post of social media editor is.
Creation of a Website
Key points while creating a website
Remember that:
- text attracts attention before graphics
- we don't read pages, we scan them
- we don't make optimal choices, we satisfice
- we don't figure out how things work, we muddle through
How good is a website?
1. The audience should not be made to think, navigation should be very easy.
2. WOW factor: attractiveness
3. White space: increases clarity
4. Navigation should answer - Where am I? Where should I go from here?
5. Typography: should be in sync
6. Grid: text and images should be aligned in an optimum manner
7. Colors used should be co-ordinated
8. Consistency: from end to end there should be consistency
Eye Movement:
The eye scans the page in an F pattern.
Also banners are mostly ignored on web pages.
Fancy formatting and words may also be ignored - misunderstood as advertisements.
The short story of the Internet
The Internet is the most popular medium among the urban masses around the world. Understanding the history of the internet and its entry into other fields gives readers an insight into its successful journey. The invention of electronic mail played a great role in the internet’s journey into the media field. The year 1971 was an important one: the first email was sent by the computer engineer Ray Tomlinson. Later the networking protocols TCP/IP were formulated. The concept of a world-wide network was introduced in 1982, and the internet was born in its full shape on January 1, 1983. Later, in 1991, Tim Berners-Lee came up with the World Wide Web (WWW), with the Hypertext Transfer Protocol (HTTP) as the foundation for its data communication.
Web 1.0 is the first stage of the World Wide Web, in which pages were connected by hyperlinks. Later came Web 2.0, which allowed users to interact and engage with each other through social media platforms as creators of user-generated content in a virtual community. Thus the new era of social networking sites became a reality. Following this, new media emerged as the strongest media, in which users are both creators and participants.