Wednesday, March 6, 2019
Data Mining and Data Warehouse Essay
ABSTRACT

Data mining, the extraction of hidden predictive information from large databases, is a powerful new technology with great potential to help companies focus on the most important information in their data warehouses. Data mining tools predict future trends and behaviors, allowing businesses to make proactive, knowledge-driven decisions. A data warehouse is a computer system designed to give business decision-makers instant access to information. The warehouse copies its data from existing systems like order entry, general ledger, and human resources and stores it for use by executives rather than programmers. Data warehouse users use special software that enables them to create and access information when they need it, as opposed to following a reporting schedule defined by the information systems (IS) department. This paper describes the core concepts of data warehousing and data mining, the basic architecture of each, and the functions and working of data mining. It also presents the path from the data warehouse to data mining.

INTRODUCTION

Modern organizations are under enormous pressure from the recent development of technology. Clearly we need rapid access to all kinds of information. To support this, we need to consider the means of obtaining it and to identify the relevant trend analysis. To perform any trend analysis we must have access to a database. In most organizations you will find really large databases in operation for normal daily transactions. These types of databases are known as operational databases. In most cases they have not been designed to store historical data or to respond to queries, but simply to support all the applications for day-to-day transactions.

The second type of database found in organizations is the data warehouse.
This is designed for strategic decision support and is largely built up from the databases that make up the operational database. The basic characteristic of a data warehouse is that it contains vast amounts of data, which can mean billions of records. Smaller, local data warehouses are called data marts. A data warehouse is designed especially for decision support queries. Therefore, only data that is needed for decision support is extracted from the operational data and stored in the data warehouse, along with the time when it was retrieved from the operational databases.

DEFINITION

DATA WAREHOUSING

A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.

Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, sales can be a particular subject.

Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse there will be only a single way of identifying a product.

Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even older data from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold the most recent address of a client, whereas a data warehouse can hold all addresses associated with a client.

Non-volatile: Once data is in the data warehouse, it will not change.
So, historical data in a data warehouse should never be altered.

The following are the typical steps involved in the data warehousing project cycle.
* Requirement Gathering
* Physical Environment Setup
* Data Modeling
* ETL
* OLAP Cube Design
* Front End Development
* Report Development
* Performance Tuning
* Query Optimization
* Quality Assurance
* Rolling out to Production
* Production Maintenance
* Incremental Enhancements

Benefits of a data warehouse

A data warehouse maintains a copy of information from the source transaction systems. This architectural complexity provides the opportunity to:
* Maintain data history, even if the source transaction systems do not.
* Integrate data from multiple source systems, enabling a central view across the enterprise. This benefit is always valuable, but particularly so when the organization has grown by merger.
* Improve data quality, by providing consistent codes and descriptions, and by flagging or even fixing bad data.
* Present the organization's information consistently.
* Provide a single common data model for all data of interest, regardless of the data's source.
* Restructure the data so that it makes sense to the business users.
* Restructure the data so that it delivers excellent query performance, even for complex analytic queries, without impacting the operational systems.
* Add value to operational business applications, notably customer relationship management (CRM) systems.

Data Mining (DM)

Data mining, also known as knowledge discovery, refers to computer-assisted tools and techniques for sifting through and analyzing these vast data stores in order to find trends, patterns, and correlations that can guide decision making and increase understanding. Data mining covers a wide variety of uses, from analyzing customer purchases to discovering galaxies.

In essence, data mining is the equivalent of finding gold nuggets in a mountain of data.
The monumental task of finding hidden gold depends heavily upon the power of computers. The purpose of DM is to analyze and understand past trends and predict future trends.

By predicting future trends, business organizations can better position their products and services for financial gain. Nonprofit organizations have also achieved significant benefits from data mining, such as in the area of scientific progress. The concept of data mining is simple yet powerful. The simplicity of the concept is deceiving, however. Traditional methods of analyzing data, involving query-and-report approaches, cannot handle tasks of such magnitude and complexity.

Data mining consists of five major elements:
* Extract, transform, and load transaction data onto the data warehouse system.
* Store and manage the data in a multidimensional database system.
* Provide data access to business analysts and information technology professionals.
* Analyze the data with application software.
* Present the data in a useful format, such as a graph or table.

Data mining services can be used for the following functions:
* Research and surveys: Data mining can be used for product research, surveys, market research and analysis. Information can be gathered that is quite useful in driving new marketing campaigns and promotions.
* Information collection: Through the web scraping process it is possible to collect information regarding investors, investments and funds by scraping related websites and databases.
* Customer opinions: Customer views and suggestions play an important role in the way a company operates. This information can readily be found on forums, blogs and other resources where customers freely provide their views.
* Data scanning: Data that is collected and stored is of little value unless it is scanned.
Scanning is important to identify patterns and similarities contained in the data.
* Extraction of information: This is the process of identifying the useful patterns in data that can be used in the decision-making process. It matters because decision making must be based on sound information and facts.
* Pre-processing of data: Usually the data collected is stored in the data warehouse. This data needs to be pre-processed. Pre-processing means that data deemed unimportant may be removed manually by data mining experts.
* Web data: Web data usually poses many challenges in mining because of its nature. For instance, web data is dynamic, meaning it keeps changing from time to time. Therefore the process of data mining should be repeated at regular intervals.
* Competitor analysis: There is a need to understand how your competitors are faring in the business market. You need to know both their weaknesses and strengths. Their methods of marketing and distribution can be mined. How they reduce their overall costs is also quite important.
* Online research: The internet is highly regarded for its huge volume of information and is widely seen as the largest source of information. It is possible to gather a lot of information regarding different companies, customers and your business clients. It is also possible to detect fraud through online means.
* News: Nowadays, with almost all major newspapers and news sources posting their news online, it is possible to gather information regarding trends and other critical areas. In this way, it is possible to be in a better position to compete in the market.
* Updating data: This is quite important. Data collected will be useless unless it is updated.
This is to ensure that the information is relevant enough to base decisions on.

How does data mining work?

While large-scale information technology has been evolving separate transaction and analytical systems, data mining provides the link between the two. Data mining software analyzes relationships and patterns in stored transaction data based on open-ended user queries. Several types of analytical software are available: statistical, machine learning, and neural networks. Generally, any of four types of relationships are sought:
* Classes: Stored data is used to locate data in predetermined groups. For example, a restaurant chain could mine customer purchase data to determine when customers visit and what they typically order. This information could be used to increase traffic by having daily specials.
* Clusters: Data items are grouped according to logical relationships or consumer preferences. For example, data can be mined to identify market segments or consumer affinities.
* Associations: Data can be mined to identify associations. The beer-diaper example is an example of associative mining.
* Sequential patterns: Data is mined to anticipate behavior patterns and trends. For example, an outdoor equipment retailer could predict the likelihood of a backpack being purchased based on a consumer's purchase of sleeping bags and hiking shoes.

Industries/fields where data mining is currently applied are as follows:

1. Data Mining in the Banking Sector

Worldwide, the banking sector is ahead of many other industries in using mining techniques for its vast customer databases. Although banks have employed statistical analysis tools with some success for several years, previously unseen patterns of customer behavior are now coming into clear focus with the aid of new data mining tools. Statistical tools and even OLAP find the answers, but more advanced data mining tools provide insight into the answers.
Some of the applications of data mining in this industry are:
(i) Predict customer reaction to changes in interest rates
(ii) Identify customers who will be most receptive to new product offers
(iii) Identify loyal customers
(iv) Pinpoint which clients are at the highest risk of defaulting on a loan
(v) Find out which persons or groups will opt for each type of loan in the following year
(vi) Detect fraudulent activities in credit card transactions
(vii) Predict which clients are likely to change their credit card affiliation in the next quarter
(viii) Determine customer preference among the different modes of transaction, namely through a teller or through credit cards, etc.

2. Data Mining in the Insurance Sector

Insurance companies can benefit from modern data mining methodologies, which help companies to reduce costs, increase profits, retain current customers, acquire new customers, and develop new products. This can be done through:
(1) Evaluating the risk of the assets being insured, taking into account the characteristics of the asset as well as the owner of the asset
(2) Formulating statistical models of insurance risks
(3) Using the joint Poisson/log-normal model of mining to optimize insurance policies
(4) Finding the actuarial credibility of the risk groups among insurers

3. Data Mining in Telecommunication

Data mining techniques are now used across most telecommunication activities, including:
(1) Analysis of telecom service purchases
(2) Prediction of telephone calling patterns
(3) Management of resources and network traffic
(4) Automation of network management and maintenance using artificial intelligence to diagnose and repair network transmission problems, etc.

4. Data Mining in Fraud Detection

Data mining has found wide and useful application in various fraud detection processes, such as:
(1) Credit card fraud detection using a combined parallel approach
(2) Fraud detection in the voters list using neural networks in combination with symbolic and analog data mining
(3) Fraud detection in passport applications by designing a specific online learning diagnostic system
(4) Rule-based and analog-based detection of false medical claims, and so on

An Architecture for Data Mining

To best apply these advanced techniques, they must be fully integrated with a data warehouse as well as with flexible interactive business analysis tools. Many data mining tools currently operate outside of the warehouse, requiring extra steps for extracting, importing, and analyzing the data. Furthermore, when new insights require operational implementation, integration with the warehouse simplifies the application of results from data mining. The resulting analytic data warehouse can be applied to improve business processes throughout the organization, in areas such as promotional campaign management, fraud detection, new product rollout, and so on. Figure 1 illustrates an architecture for advanced analysis in a large data warehouse.

Figure 2 Integrated Data Mining Architecture

FROM DATA WAREHOUSE TO DATA MINING

DM is a set of methods for data analysis, created with the aim of finding specific dependences, relations and rules in the data and expressing them as new, higher-level information. Unlike the data warehouse, which offers a single, unified view of the data, DM gives results that show the relations and interdependence of data. These dependences are mostly based on various mathematical and statistical relations.

Figure 3 Process of knowledge data discovery

EMERGING TRENDS IN DATA MINING

Web mining is the application of data mining techniques to discover patterns from the Web.
According to analysis targets, web mining can be divided into three different types: Web usage mining, Web content mining and Web structure mining.

Web usage mining

Web usage mining is the process of extracting useful information from server logs, i.e. users' history. It is the process of finding out what users are looking for on the Internet. Some users might be looking only at textual data, whereas others might be interested in multimedia data.

Web structure mining

Web structure mining is the process of using graph theory to analyze the node and connection structure of a web site. According to the type of web structural data, web structure mining can be divided into two kinds:
1. Extracting patterns from hyperlinks in the web: a hyperlink is a structural component that connects the web page to a different location.
2. Mining the document structure: analysis of the tree-like structure of pages to describe HTML or XML tag usage.

Web content mining

Web content mining is the mining, extraction and integration of useful data, information and knowledge from Web page contents.

Data Stream Mining is the process of extracting knowledge structures from continuous, rapid data records. A data stream is an ordered sequence of instances that, in many applications of data stream mining, can be read only once or a small number of times using limited computing and storage capabilities. Examples of data streams include computer network traffic, phone conversations, ATM transactions, web searches, and sensor data.
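
To make the "read only once, with limited storage" constraint on data streams concrete, here is a minimal Python sketch. It is one possible illustration rather than a method taken from this essay: it uses Welford's online algorithm to keep a running count, mean and variance in constant memory. The names `RunningStats`, `summarize` and `atm_withdrawals`, and the sample amounts, are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class RunningStats:
    """Constant-memory summary of a numeric stream (Welford's online algorithm)."""

    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # running sum of squared deviations from the current mean

    def update(self, value: float) -> None:
        """Fold one new record into the summary without storing the record."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def variance(self) -> float:
        """Population variance of the values seen so far (0.0 for an empty stream)."""
        return self.m2 / self.count if self.count else 0.0


def summarize(stream: Iterable[float]) -> RunningStats:
    """Consume the stream in a single pass; each record is read once, then dropped."""
    stats = RunningStats()
    for value in stream:
        stats.update(value)
    return stats


def atm_withdrawals() -> Iterator[float]:
    """Hypothetical stand-in for a live feed of ATM transaction amounts."""
    yield from [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]


if __name__ == "__main__":
    stats = summarize(atm_withdrawals())
    print(stats.count, stats.mean, stats.variance)
```

Because the summary holds only three numbers, memory use stays the same however long the stream runs, which is the property the definition above asks for. A real stream miner would typically maintain richer structures (samples, sketches, or incremental models) under the same single-pass rule.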