Thursday, 17 August 2017

BIG DATA: NOSQL TABLE SCHEMA + ORACLE NOSQL DATABASE


Source:https://stackoverflow.com/questions/11378457/nosql-table-schema



NOSQL Table Schema

I'm trying to plan a NOSQL table schema. There are relationships in my data, but they are mostly what would be N:N in a relational db; there are very few normal 1:N relationships.
So in this case, I'm trying to create implicit relationships that will allow me to browse from both ends of the relationship. I'm using Azure Table Storage, so I understand that full-text searching isn't available; I can only retrieve an "object" by its Partition Key + Row Key combination.
So imagine I have a table called "People" and a table called "Hamburgers" and each object in the tables can be related to multiple objects in the other table. Hamburgers are eaten by many people, people each eat many hamburgers.
Since the relationship is probably weighted to the people side - i.e. there are more people per hamburger than vice-versa, I would handle this in the tables like this:
Hamburger Table
    Partition Key: Only 1 partition
    Row Key: Unique ID
People Table
    Partition Key: Only 1 partition
    Row Key: Unique ID
    "Columns": an extra value for every hamburger the person eats
Hamburger-People Table
    Partition Key: Hamburger Row Key
    Row Key: People Row Key
This way, if I'm looking at a hamburger and want to see all the people that eat it, I can go to the Hamburger-People table and use my Hamburger's Row Key to get the partition of all the people that eat the hamburger.
If I'm at a person and want to see all the hamburgers he/she eats, I have the extra values with the Row Keys of the hamburgers the person eats.
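The two lookups described above can be sketched with plain Python dictionaries standing in for the three tables. This is only an illustration of the key layout under that assumption, not Azure Table Storage code; the names `hamburgers`, `people`, `hamburger_people`, `add_favorite`, `people_who_eat` and `burgers_eaten_by` are made up for the example.

```python
# In-memory stand-ins for the three tables, keyed by (partition_key, row_key).
# All names are hypothetical; this is not the Azure Table Storage API.
hamburgers = {}        # ("hamburgers", burger_id) -> entity
people = {}            # ("people", person_id) -> entity holding burger IDs as extra values
hamburger_people = {}  # (burger_id, person_id) -> relationship entity


def add_favorite(person_id, burger_id):
    """Record that a person eats a hamburger, writing to all three tables."""
    hamburgers.setdefault(("hamburgers", burger_id), {"id": burger_id})
    person = people.setdefault(("people", person_id), {"id": person_id, "burgers": []})
    if burger_id not in person["burgers"]:
        person["burgers"].append(burger_id)
    hamburger_people[(burger_id, person_id)] = {}


def people_who_eat(burger_id):
    """Hamburger side: read the Hamburger-People partition whose key is the burger ID.

    Here that is simulated by filtering the dictionary keys.
    """
    return sorted(p for (b, p) in hamburger_people if b == burger_id)


def burgers_eaten_by(person_id):
    """Person side: read the extra values stored on the person entity."""
    return list(people[("people", person_id)]["burgers"])


add_favorite("alice", "whopper")
add_favorite("bob", "whopper")
add_favorite("alice", "tasty")
print(people_who_eat("whopper"))
print(burgers_eaten_by("alice"))
```

Note that every new relationship needs two writes (the person entity and the Hamburger-People row), so the two sides can drift apart if one write fails.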
When inserting data into the tables, if the data involves a hamburger/person relationship, I would insert both values in the proper tables, then create the Hamburger-People table. If I was trying to keep a duplicate-free list of hamburgers, I would need to search the Hamburger table first to make sure the hamburger wasn't already in there (like "Whopper" - if it's in there, I wouldn't insert it again). Then, I would need to go insert a row in the hamburger's existing partition in Hamburger-People table.
But for the most part, the no-duplicate requirement doesn't exist.
Is this a good best-practices approach to NOSQL schema, or am I going to run into problems later?
UPDATE Also, I would like to be able to partition the data tables later, but I'm not sure how to do so with this structure; adding a 2nd partition to the hamburger table would require me to store an extra value in the hamburger-People table, and I'm not sure if that would start to be too complex.


OK, nice questions. I think most of them are ones every RDBMS developer faces as soon as they hit the NoSQL world:
1. How to group the partitions? To get the most out of partitions, the load on your database should be distributed across your servers. Let's see what happens with your approach.
A person with key "A" enters the restaurant. You save the person and their burger, a Classic Tasty (key "T"): the person record goes to server X and the burger goes to server Y. Now a new customer with key "B" comes in and wants something different, a burger "W". Again the person goes to server X, and this time the burger goes to server X as well, so server X is getting all the load. If you repeat this, you'll see that server X becomes a bottleneck, because 75% of the records are going there (all the people and 50% of the burgers), and that will create problems with your load. The problem gets even worse when you query, because all the queries will hit server X. To solve this, you could use the person's key as part of the partition key for the relationship, so that each person is stored on the same server as their burger relationships. That way your workload is balanced, and you won't have problems if one of the servers goes down (the person and their hamburgers will be "lost" together); this is a consistent "inconsistency".
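The skew described above can be illustrated with a small simulation. It is a sketch under assumptions: four servers, and a placement rule that hashes the partition key, which is not necessarily how Azure or any particular store assigns partitions. `SERVERS`, `server_for` and `load` are hypothetical names.

```python
import hashlib
from collections import Counter

SERVERS = 4  # assumed number of servers, for illustration only


def server_for(partition_key):
    """Map a partition key to a server with a stable hash (an assumed placement rule)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % SERVERS


def load(partition_keys):
    """Count how many records land on each server."""
    return Counter(server_for(key) for key in partition_keys)


person_ids = [f"person-{i}" for i in range(1000)]

# Every person record shares one partition key, so one server takes every record.
single = load("people" for _ in person_ids)

# Each person's own key is the partition key, so records spread across servers.
spread = load(person_ids)

print("one partition:      ", dict(single))
print("partition per person:", dict(spread))
```

With a single partition key, the whole table is served from one place; using the person's key gives the store something to distribute.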
2. Should I use a "relationship" in a NoSQL database? Remember that NoSQL lets you duplicate information whenever your problem calls for it, in order to avoid "overqueries". If you store together the information that is commonly queried together, you avoid a round trip to the database. So if you store a "transaction" instead of "person and burgers", you get better performance and avoid some hits to the database. Let's work through an example with real data using your approach and compare it with "my" approach:
  1. Joe Black comes to the restaurant and asks for a Tasty. Here you will do the following transactions: create a Joe Black record, then create a burger transaction record.
If you want to list your daily transactions, you will need to:
Get all the day's records from the person-burger "table", then go to the person "table" and retrieve the customers' names, and then go to the hamburger records and retrieve their names. (You won't be able to do cross-table queries, because some records could be on one server and others on a second server.)
OK, what if you create a "transactions" table and store the following JSON in it:
{ custid: "AAABCCC", name: "Joe", lastName: "Black", date: "2012/07/07", order: { code: "Burger0001", name: "Tasty", price: 3.5 } }
I know you will have several records with the same "Tasty" description. That's denormalization, which is very useful when you apply NoSQL solutions to this type of problem. Now, how many transactions did you need to store the information in the database? Just one! And how many queries will you need to retrieve the information at the end of the day? Again, just one. It will create some problems, but it will save you a lot of work too. For example: could you reprint the order easily? (Yes, you can!) What if the customer's name changes? Is that even possible?
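As a rough sketch of the "one write, one read" idea, the snippet below stores the whole transaction document under its date and reads the day's orders back in a single lookup. The in-memory `transactions` dictionary and the helpers `record_order` and `daily_report` are assumptions made for illustration; they are not djondb or Azure calls.

```python
import json

# Hypothetical in-memory "transactions" table: one list of JSON documents per day.
transactions = {}


def record_order(doc):
    """One write stores the customer, the order and the burger details together."""
    transactions.setdefault(doc["date"], []).append(json.dumps(doc))


def daily_report(date):
    """One read returns everything needed to list or reprint the day's orders."""
    return [json.loads(raw) for raw in transactions.get(date, [])]


record_order({
    "custid": "AAABCCC", "name": "Joe", "lastName": "Black", "date": "2012/07/07",
    "order": {"code": "Burger0001", "name": "Tasty", "price": 3.5},
})

for t in daily_report("2012/07/07"):
    print(t["name"], t["lastName"], t["order"]["name"], t["order"]["price"])
```

The trade-off is the one raised above: the burger name is copied into every document, so renaming a burger or a customer means rewriting old documents or accepting that they keep the old value.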
I hope this helps in some way.
I'm the creator of http://djondb.com, so I think having inside knowledge gives me a different approach to these problems, based on what the database will be able to do. I'm not aware of how Azure will handle the queries if you can only query by row keys and not by document values, but I hope this gives you some insight anyway.
  
Thanks Cross, good information. I have 2 questions: (1)This application is a social app, so it's less about what burgers did this person order than keeping a list of their favorite burgers. Since the main purpose is to keep track of the relationships, I'm not sure the transaction record would help - would it? Maybe I'm missing some of the benefits. – Andrew B Schultz Jul 10 '12 at 15:22
  
(2) The issue of servers - with Azure, I don't have control or knowledge of the server where a table is stored (I don't think). That's abstracted, so I'm not sure thinking about that matters. Unless you're saying that by making sure the person key is part of the Hamburger-Person entity's partition key I would somehow ensure that the tables stay on the same server. But I don't see how that could be possible, because the table won't know that a string I'm inserting in a partition key is part of a row key elsewhere ... – Andrew B Schultz Jul 10 '12 at 15:26
  
@AndrewBSchultz if you want to keep track of their favorite burgers, the solution will be the same: store the "name" of the burger instead of its code. That way you won't need an extra query to get the names of their favorite burgers. Something like { "mb": "Tasty" } instead of { "mb": "0001001" }; with the code you will need an extra query to get the description of the product. – Cross Jul 11 '12 at 15:57
  
@AndrewBSchultz I'll need to understand more about how Microsoft Azure handles partitions, but what you need to bear in mind is that your data model has to address the problem of having part of the data on one server and part on another. That will mean hitting two servers, and one of them could become a bottleneck at some point. If you create a data model that ensures you get all the information you need in one step, then you have solved any kind of issue with the partition keys or anything else, meaning you need an approach like the one I stated before, where you save everything in one document. – Cross Jul 11 '12 at 16:01

ORACLE-NOSQL-DATABASE http://www.oracle.com/technetwork/products/nosqldb/overview/nosqlandsqltoo-2041272.pdf

NoSQL and SQL Introspective Oracle NoSQL Database 11g Release 2 (11.2.1.2) Oracle White Paper October 2013 Oracle NoSQL Database Oracle White Paper— NoSQL and SQL 


Introspective: Oracle NoSQL Database 11g Release 2 

Introduction
NoSQL – purpose-built data management
NoSQL Application Example – simple data, simple queries
Customer Profile Management
RDBMS vs NoSQL – The right tool for the right job
Simple Data, Simple Queries
Simple Joins
Complex Queries
What is NoSQL?
NoSQL systems
Oracle NoSQL Database – Value Proposition
Other NoSQL databases
NoSQL Database and RDBMS Database
Where do NoSQL Database projects get started?
Conclusion


Introduction

NoSQL – purpose-built data management
NoSQL systems are purpose-built solutions, designed to address specific technical requirements. NoSQL systems originated to provide high throughput, fault-tolerant horizontally scalable simple data storage and retrieval with a bare minimum of additional functionality. Specifically, NoSQL systems were created in order to provide: 

• Horizontally distributed data management of simple structured and unstructured data across a large cluster of commodity storage systems,
• Highly fault-tolerant data management and ability to continue operating even after multiple hardware and system failures,
• Very high throughput for simple read/write operations, with limited or no transaction semantics,
• Schema-less or flexible schema definitions allowing highly variable data and record structures,
• Application and application developer-centric special purpose data models and APIs.

On the other hand, RDBMS systems, like Oracle Database, are designed to provide general purpose data management capabilities and standard APIs for a very wide variety of requirements. Hence, they incorporate a lot of features and functionality. 

Not all applications require the full set of RDBMS functionality. If the application doesn’t require all of the functionality in a typical RDBMS, why would the customer pay for the hundreds of features that they don’t need? For example, web-centric customer service, loyalty card programs and customer profile management applications primarily require fast, scalable key-based access to data. In such scenarios, a cost-effective, purpose-built NoSQL database is an attractive alternative to a relational system. Enterprises, ISVs and SIs are actively identifying applications and data management processes which can be implemented and managed more effectively using special-purpose NoSQL systems. 

NoSQL Application Example – simple data, simple queries

Customer Profile Management

Let’s look at a specific example of an application that matches the capabilities of a NoSQL system – customer profile management.

In the past, customer profiles were typically financial transaction-oriented data structures/repositories. Today, a customer profile includes a much richer data set that includes information from a wide variety of customer interaction points that include both structured and unstructured data. Capturing and managing this new class of data is crucial to the enterprise, in order to more effectively obtain a 360 degree view of a customer and to optimize customer interactions. This information enables a broad category of Line-of-Business applications from Marketing, Advertising, Customer Service, Risk Analysis, Fraud Detection, Personalized content, Promotional Campaigns, Loyalty Programs, Inventory Management, etc.

These Line-of-Business applications leverage the richer user-profile data in order to provide 

a. More personalized customer experiences via targeted product offerings, special promotions, loyalty rewards, more informed/context sensitive interactions, etc.,
b. Better operational insight into how their customers interact with them, how they perceive the company and its products and services, with a longer and more complete historical perspective,
c. Better competitive insight into how customers perceive their competition,
d. Better operational decision making, based on a more detailed understanding of their customers.

Modern customer profile management applications benefit directly from the capabilities of NoSQL systems because they provide

a. A horizontally scalable, distributed data management system to manage the large volume of data that is part of a rich customer profile and that can grow with little to no maintenance as the customer base grows,
b. A highly fault tolerant system, ensuring that the customer profile data is always available to the applications that need to access and update it,
c. A flexible or no-schema data store, which facilitates a wide variety of data record formats to be stored in a given customer profile and for those record formats to change over time,
d. Specialized, high throughput application-specific read and write access to the portions of the customer profile that are important for that particular application [1].

[1] It is important to note that these applications often require support for transactions and concurrency. Not all NoSQL databases provide this functionality. Oracle NoSQL Database does. For more information see section on Oracle NoSQL Database on page 9.

A great example of this is our recent work with a well-known entertainment company on their Next Generation Experience (NGE) application. This company wants to create a centralized repository for each of its customers’ experience across all of its entertainment properties (theme parks, restaurants, cruises, merchandizing outlets, etc.) over a span of decades. The value of this solution is that a customer’s visits as a child, then as a parent, then as a grandparent are all available in the system, regardless of which (other) system the interaction originated from. They want the customer profile repository to contain both structured and unstructured data, with a schema-flexible format so that it can evolve over time. Each customer’s profile will include URL-style pointers to records of activities at the various properties, as well as photos, privileges, loyalty program links, associated groups, other related customers, customer feedback, etc.

For this company the concept of “a customer profile” includes many different sources and types of data, over an extended period of time, resulting in a very large and varied set of data – centralized and served from one data management infrastructure. This customer profile will be used to drive existing and future applications that provide the customer with a personalized, enhanced experience, utilizing both real-time and Business Intelligence Advanced Analytics access to the customer profile. Originally this centralized repository was designed to be managed in a relational database. However, last year they decided to re-design the NGE repository to use a NoSQL database.
They made this decision because

a. the NGE customer profile repository primarily required simple queries, simple data, flexible schemas and horizontally distributed storage, which was more efficiently addressed using a special purpose NoSQL database,
b. of specific application developer technical synergies with NoSQL technology. In particular, their applications didn’t manage data using SQL, nor did they represent the data as SQL tables. From the application point of view, records were just JSON objects consisting of simple, but variable data elements and pointers to other records. Their natural application-developer-centric technical inclination was to eschew using SQL and focus on a simpler, more special purpose database and API that was closer to their application’s view of the data.

They are now also looking at a NoSQL Database. This is a great example of a common use case for a NoSQL Database.

RDBMS vs NoSQL – The right tool for the right job

Sophisticated RDBMS systems, for example the Oracle Database, encompass a very rich set of features and functionality, primarily focused on general purpose OLTP and Data Warehousing use by many different types of applications. NoSQL systems, on the other hand, encompass a very limited set of features and functionality while providing horizontal scalability, availability and data modeling flexibility for specific applications that manage simple data and simple queries. NoSQL systems typically include a set of software packages running across dozens, hundreds or even thousands of smaller systems. Efficient horizontal scalability, high availability and concurrent data processing are provided out of the box, but the integration effort to weave the software components into an integrated solution is left up to the customer. Many of the advanced features that are common with mature RDBMS systems are not present in NoSQL systems.
This requires NoSQL technology users to integrate their NoSQL data with the RDBMS systems in order to make use of the advanced features that they need. NoSQL systems can do lots of specific-purpose, simple operations extremely quickly, but are not designed with the features to perform complex, general purpose operations in an integrated way. Let’s look at an example of how this plays out in terms of understanding what types of application characteristics are best suited for NoSQL and which ones are not.

Simple Data, Simple Queries

Let’s take the example of customer profile management discussed earlier. The functionality that is required is very simple – the application needs to read and write a few records based on the primary key -- customer ID. These records combine structured and unstructured data, stored in a way (often de-normalized) that allows the application to only operate on the subset of data that it needs for that specific type of transaction. Customer profile management applications typically have to perform an extremely high number of these simple operations of lookup based on a customer ID with minimal latency. The latency component is critical because these applications often implement the “last mile” interface with the customer. Customers and other downstream systems interact directly with the content managed in the NoSQL Database, and real-world studies have shown that any increase in latency can be linked directly to reduced revenue and loss of business [2].

[2] Amazon.com conducted an extensive study that demonstrated a direct relationship between increased latency and loss of revenue. For every 100ms (that’s 1/10th of a second) of increased latency on their web site, they observed a decrease of 1% in revenue. In other words, an almost imperceptible increase in latency translated directly into lost business because customers stopped browsing or using the site.

These “last mile” applications, based on rich customer profile management, are a perfect fit for NoSQL databases. An RDBMS can provide the same level of functionality and throughput, but it may not fit the technical requirements as closely or as efficiently as a special purpose NoSQL database. This example can be extended to sensor data (machine profiles, rather than customer profiles), loyalty card programs, financial data, product data, etc. The bottom line is that for these kinds of operations (simple queries over simple data), NoSQL databases can be more efficient than RDBMS systems.

Simple Joins

Let’s imagine a slightly more complex requirement, where the customer profile application needs to perform multiple lookups in a small set of related tables (joins). For example, a customer profile is likely to include a list of products (stored as product IDs) that the customer has “liked” or “disliked” in the past. In this case, it’s relatively easy to express the SQL JOIN queries in an RDBMS (joining the Customer Profile table with Product table) and have the query optimizer do the right thing to optimize and execute the joins. In a NoSQL database, it’s up to the application (and the application developer) to implement the appropriate primary and foreign key lookups in the most efficient order as separate operations in order to satisfy the query. In a NoSQL database, the application logic would have to contain code to perform a lookup of the Product table for each product ID in the customer profile. Both the RDBMS and the NoSQL database could be used to support this kind of application requirement – the RDBMS is easier to implement, while the NoSQL database may perform the simple queries more efficiently but some of the functionality would have to be implemented within the application logic (the application developer has to implement the joins using client-side code to perform the appropriate lookups).

So, why wouldn’t a developer choose the “easier” option of using an RDBMS for this kind of application? Primarily for three reasons:

a. Simple joins are basically just incremental key lookups, sometimes described as client-side joins. These simple joins are easily implemented in application code,
b. When scalability and performance are the key requirement, simple special-purpose data stores like a NoSQL Database offer an option that is better performing at a lower cost of operation,
c. Over the past decade application development has changed. Even within established RDBMS shops and especially in new/innovative technology companies, the development of web-scale applications and new customer-oriented services has moved towards Java-based application development. Developers are much more accustomed to building data management and access to specialized, high performance, high scalability databases such as Key-Value based NoSQL databases in order to leverage the advantages that these products provide, even if it requires some additional application development effort. It’s a “religious” discussion!

Complex Queries

Finally, let’s imagine that the application requires more than simple queries or simple joins. The application may need to perform complex, multi-table joins [3], to perform complex analytics; it may require fixed schemas in order to enforce referential integrity or other business rules; it may require heightened data security, data lifecycle management or other advanced features that are commonly found in an RDBMS. In this case, the clear choice is to use an RDBMS-based solution. A NoSQL technology based solution would require significant functionality that would have to be implemented in the application. In fact, this is the primary argument in favor of staying with an RDBMS, rather than switching to a NoSQL database. If the application needs the rich functionality of an RDBMS, don’t try moving to a platform that doesn’t have (nor will it ever have) that functionality.

[3] An example of a complex multi-table join: In an effort to reduce inventory overhead, a retail manager wants to identify low margin items that are not selling well in the northwest region. They might ask “Give me the Supplier name, Item Number and Item Description from the Supplier and Inventory tables where the Price/Cost is less than 10% and the QuantityOnHand is greater than 20 and the total sales for the last 6 months in the NW region is less than $1000 from the Sales and Inventory tables.” This is relatively easy to express in SQL and let the SQL optimizer and query planner figure out how to get the data. But in a NoSQL application it would require writing multiple queries and application code to collect and process the results of those queries in order to produce the same result.

The above characterization of use cases reflects what we’re seeing in our customer base – complex application functionality continues to be implemented using the Oracle Database. However, in application scenarios where simple functionality, horizontal scalability and schema flexibility are the primary requirements, these applications are being implemented in NoSQL databases. The fact is that a full customer solution typically includes both types of operations, and therefore both types of databases. Customers are actively engaged in identifying the right database management tool for the right job. They are undertaking the necessary steps to adopt and integrate NoSQL databases as part of an overall solution, where that tool makes sense. When talking with customers it’s critical to recognize the technical challenges that are being addressed, that one size does not fit all and how/when to use

What is NoSQL?
NoSQL (commonly interpreted as “not only SQL”) is a broad class of database management systems developed over the last decade that is primarily identified as not adhering to the widely used relational database management system model. Although it is inherently difficult to define a technology based on what it isn’t, and although NoSQL systems vary significantly in implementation, features and behavior, there are common design principles and technology requirements that NoSQL databases share.

NoSQL systems

NoSQL systems provide horizontal scaling across a very large number of servers (tens, hundreds or even thousands of servers) by using a technique that has been deployed for many years in conventional RDBMS databases, called sharding. Sharding requires that a separate database run on each server and that the data be physically partitioned so that each database has its own subset of the data stored on its own local disks.

NoSQL systems maximize throughput by limiting how the sharded data is managed and accessed. NoSQL systems typically do not provide support for operations that require accessing multiple shards of data – this includes joins, distributed transactions and coordinated/synchronized schema changes – because of the I/O overhead and coordination required. NoSQL systems typically implement limited transaction capability, either by relaxing transaction consistency (by using “eventual consistency”), providing shard-local ACID consistency only or by disallowing transactions altogether.

The primary use case for NoSQL systems is horizontally distributed, sharded data sets with a flexible schema, and simple read/write operations. More complex, general purpose functionality is not considered to be the primary goal of NoSQL databases.
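To make the sharding description concrete, here is a minimal sketch in which each shard is a separate dictionary, a stable hash of the key decides which shard owns a record, and the "join" between a customer profile and its products is done by the application as a series of key lookups. It rests on assumptions throughout: three shards, hash-based placement, and the hypothetical names `shard_for`, `put`, `get` and `liked_products`. It is not the Oracle NoSQL Database API.

```python
import zlib

NUM_SHARDS = 3  # assumed cluster size, for illustration only
shards = [{} for _ in range(NUM_SHARDS)]  # each dict stands in for one server's local store


def shard_for(key):
    """Pick the shard that owns a key using a stable hash of the key."""
    return shards[zlib.crc32(key.encode("utf-8")) % NUM_SHARDS]


def put(key, value):
    """Write a record to the single shard that owns its key."""
    shard_for(key)[key] = value


def get(key):
    """Read a record from the single shard that owns its key."""
    return shard_for(key).get(key)


def liked_products(customer_id):
    """Client-side join: one lookup for the profile, then one per product ID."""
    profile = get(f"customer/{customer_id}")
    if profile is None:
        return []
    return [get(f"product/{pid}") for pid in profile["liked"]]


put("product/p1", {"name": "Widget"})
put("product/p2", {"name": "Gadget"})
put("customer/c42", {"name": "Joe", "liked": ["p1", "p2"]})

print([p["name"] for p in liked_products("c42")])  # ['Widget', 'Gadget']
```

Each `get` touches exactly one shard, which is what keeps single-key operations cheap; the cost of the join shows up as extra round trips issued by the application rather than as work coordinated across shards by the database.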
Oracle NoSQL Database – Value Proposition

The NoSQL Database is a highly scalable, high performance, high availability distributed Key-Value database for near-real-time access to data. Like other NoSQL databases, it is focused on providing horizontally scalable, simple data management functionality. It differentiates itself from other NoSQL databases because:

• Oracle NoSQL Database is based on mature, field-proven, production quality software. Oracle NoSQL Database utilizes Berkeley DB Java Edition as the basis for its storage and replication technology. Berkeley DB Java Edition has been powering mission-critical applications in the field for almost a decade, including companies like Amazon.com, LinkedIn, Yammer, Sand Vine and Macys.com. Many of the other NoSQL databases are creating their own storage and replication layers from scratch (and finding out how difficult that is). Others are adopting existing open source storage and replication technology (which may not be well suited to their specific requirements).
• Oracle NoSQL Database is the only NoSQL database developed and supported by a major database vendor.
• Oracle NoSQL Database is integrated with other related products like Oracle Database, Oracle Business Intelligence, Event processing, Oracle Spatial, RDF & Graph, Oracle Coherence and Hadoop/MapReduce. Other NoSQL databases require that the customer figure out how to implement integration with their other IT assets. Most Big Data projects require multiple, complementary data management technologies and Oracle is the only vendor that has a comprehensive offering of integrated technologies.
• Oracle NoSQL Database has unique features like configurable ACID transactions, and Dynamic Storage Node rebalancing, that many of the other NoSQL systems lack.
ACID transactions make application development much simpler and Dynamic Storage Node rebalancing ensures robust, consistent and scalable database deployment. Additionally, the Oracle NoSQL Database is not only proven to deliver high throughput, but also guarantees predictable throughput and latency through automatic, highly tuned database cache eviction policies and Java garbage collection parameters. Other NoSQL databases often have unexpected limitations when it comes to performance management and predictability, specifically related to file system cleanup, compaction and Java garbage collection. A report from Amazon based on actual studies showed that 100ms (that’s 1/10 of a second!) of added latency in accessing a page from the browser caused a 1% decrease in revenue!
• Oracle NoSQL Database is much easier to deploy and simpler to manage [4].
• Oracle NoSQL Database has been extensively benchmarked using the industry standard YCSB benchmark [5], that has conclusively demonstrated a) scalability to hundreds of storage nodes, b) across 10s of TB of data, c) 1.25 million operations per second [6].

Oracle NoSQL Database is Oracle’s flagship NoSQL database product. Sales can confidently recommend the Oracle NoSQL Database, alongside the Oracle Database, in order to address the needs of cost effective extreme data scalability and flexibility.

Other NoSQL databases

There are over a hundred different NoSQL databases currently being offered on the market. Customers who become early adopters of open source NoSQL databases often invest in programming staffs that participate in the development and maintenance of these products. There are currently no standards in the NoSQL technology space, so each NoSQL product is different in terms of features, behavior and implementation.
It is not within the scope of this paper to do a full competitive analysis of all of the alternative NoSQL databases that are available. For product specific competitive comparisons, see the competition section of the Oracle NoSQL Database OTN site. The value proposition of Oracle NoSQL database is discussed in the previous section. Oracle NoSQL Database is the best integrated, highest quality NoSQL database from the best of breed purveyor of enterprise-class supported, industrial grade database software.

[4] Oracle NoSQL DB Quickstart Guide; http://docs.oracle.com/cd/NOSQL/html/quickstart.html
[5] The YCSB benchmark is the de-facto standard benchmark used for NoSQL databases. More information can be found here: http://research.yahoo.com/Web_Information_Management/YCSB
[6] Oracle NoSQL Database benchmark links: https://blogs.oracle.com/charlesLamb/entry/oracle_nosql_database_exceeds_1 https://blogs.oracle.com/charlesLamb/entry/oracle_nosql_database_performance_tests

NoSQL Database and RDBMS Database

The NoSQL Databases and the RDBMS Database complement each other. Each solves a different type of requirement. The NoSQL Database is designed to cost-effectively manage large volumes of simple, structured and unstructured data. However, it is often the case that important subsets of that data need to be loaded into the RDBMS Database in order to access more advanced capabilities like complex queries, data security, data lifecycle management and Advanced Analytics. Typically the same ETL-class tools that support loading Hadoop data into RDBMS systems are also used for loading NoSQL data into an RDBMS. The Oracle NoSQL Database supports the Oracle Big Data Connectors (ODI and OLH), as well as direct access to its data via Oracle Database External Tables, allowing customers to combine relational and NoSQL data in the same query.
Just as Oracle NoSQL Database data may be moved into Oracle Database for analytical functions, data stored in the Oracle Database may be moved into Oracle NoSQL Database in order to enable high-volume, high-velocity web-based applications. A typical example of this is moving certain aspects of customer profile data and inventory information into the NoSQL Database in order to drive a web-based consumer retail application (e.g. loyalty card programs). As discussed earlier, customers may choose to replace their RDBMS Database with Oracle NoSQL Database, but only for the subset of data and functionality that is best suited to the NoSQL technology approach. Most customers will use a combination of an RDBMS and Oracle NoSQL Database to address their overall data management requirements.
Where do NoSQL Database projects get started?
Although a NoSQL database project may start in almost any technical group, our experience indicates that there are a few common ways that NoSQL projects get started. NoSQL technology is typically evaluated in the context of Big Data management infrastructure and can get started in almost any technical group within a customer’s organization. It is often the case that these projects and opportunities are being driven by application developers. When discussing application areas for a NoSQL Database or when introducing the technology to customers, it is crucial to include Application Architects and Developers in the technology discussion. It is often within these groups that NoSQL technology evaluation and research is occurring. Customers are constantly looking for opportunities to do more (increase business value, deliver new services, derive new insights) with less (using commodity hardware, open source software).
The potential value and capabilities of Big Data hold a lot of promise, and customers are actively exploring ways to leverage Big Data and/or NoSQL technologies. This technical exploration often leads to different kinds of Big Data or NoSQL initiatives within the company. These initiatives fall under some broad categories:
• New initiatives (experiments) to leverage Big Data: For example, some customers are leveraging social media data from Facebook and Twitter to provide a more comprehensive view of their customers. They are combining these Big Data insights with their NoSQL customer profile applications.
• New applications using existing data: Projects to take existing data and reuse it in new ways. For example, an entertainment company might leverage customer visits and interactions (e.g. hotel bookings, restaurant and car reservations, retail activities, etc.) from existing systems and repurpose that data into a NoSQL repository to provide a personalized experience for visitors.
• Cost-effective data management choices: Use NoSQL technologies to provide efficient special-purpose data management for existing applications that do not require the full capabilities of an RDBMS. For example, a large electricity utility company is planning to manage several terabytes of metering data in a special-purpose NoSQL repository because the application requirements, data structures and queries are very simple.
• Private cloud services: Several large customers have initiatives to build an in-house data services platform that includes RDBMS, NoSQL, MapReduce and other related technologies. By combining the three technologies into a service platform, customers can harness the rich feature set of an RDBMS, plus the distributed, scalable, simple storage and processing of NoSQL (for interactive data usage) and Hadoop/MapReduce (for batch usage).
Conclusion
NoSQL database technology is here to stay.
It addresses specific technical requirements that are not as efficiently or cost-effectively addressed with other data management technologies, including RDBMS systems. The RDBMS and Oracle NoSQL Database are complementary in the sense that they work together to address the overall data management needs of our customers, each providing the technical capabilities required by today’s complex and evolving applications. Customers can rely on the quality, integration and support of the Oracle Database and Oracle NoSQL Database, and deploy their applications with confidence, maximizing their technology investment and minimizing their risk, as opposed to opting for other NoSQL database products with unknown scalability, quality and integration challenges.
NoSQL and SQL Introspective
October 2013
Authors: Anuj Sahni, Dave Segleau
Contributing Authors: Oracle NoSQL Database Team
Oracle Corporation, World Headquarters, 500 Oracle Parkway, Redwood Shores, CA 94065, U.S.A.
Worldwide Inquiries: Phone: +1.650.506.7000, Fax: +1.650.506.7200, oracle.com
Copyright © 2013, Oracle and/or its affiliates. All rights reserved. This document is provided for information purposes only and the contents hereof are subject to change without notice. This document is not warranted to be error-free, nor subject to any other warranties or conditions, whether expressed orally or implied in law, including implied warranties and conditions of merchantability or fitness for a particular purpose. We specifically disclaim any liability with respect to this document and no contractual obligations are formed either directly or indirectly by this document. This document may not be reproduced or transmitted in any form or by any means, electronic or mechanical, for any purpose, without our prior written permission. Oracle is a registered trademark of Oracle Corporation and/or its affiliates. Other names may be trademarks of their respective owners.


