Search Results
01 The Data-driven Virtuous Cycle
Collect: widely different data sources, such as logs, IoT, social media, e-commerce, and traditional business databases. Analyze: the analysis step is focused on description and prediction. It must be stakeholder-driven, where by stakeholders we mean: Customers / Clients, Sta...
02 What is Big Data and new ways to solve problems
Definition of the four Vs. Volume (data at scale): terabytes to exabytes of data accumulated on ever cheaper storage. Variety (data in many forms): structured, unstructured, text, images, video and general multimedia. Velocity (data in motion): streaming...
03 Data-driven decisions
With data, decisions no longer have to be made in the dark or based on gut instinct; they can be based on evidence, experiments and more accurate forecasts. Data-driven organizations perform better, are operationally more predictable and are more profitable. ...
01 ER
Foundations: An entity–relationship model (or ER model) describes interrelated things of interest in a specific domain of knowledge. A basic ER model is composed of entity types (which classify the things of interest) and specifies relationships that can exist ...
02 Relational Model
Keys. Superkey: a set of attributes $K$ is a superkey for a relation $r$ if $r$ does not contain two distinct tuples $t_1$ and $t_2$ with $t_1[K] = t_2[K]$. Key: $K$ is a key for $r$ if $K$ is a minimal superkey (that is, there exists no other superkey $K'$ of $r$ that is contained in $K$ ...
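The two definitions above can be checked mechanically on a concrete relation instance. The following is a minimal sketch, assuming the relation is given as a list of distinct rows (dictionaries); the function names `is_superkey` and `is_key` are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)  # the value t[K]
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is one."""
    if not is_superkey(rows, attrs):
        return False
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )


# Illustrative data only (assumed, not taken from the course material).
students = [
    {"id": 1, "name": "Ann", "city": "Rome"},
    {"id": 2, "name": "Bob", "city": "Rome"},
    {"id": 3, "name": "Ann", "city": "Milan"},
]
```

On this sample, `["id", "name"]` is a superkey but not a key, because `["id"]` alone already identifies every row.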
01 Introduction to API
Data ingestion: Data ingestion is the first, fundamental step of any data analysis pipeline. This section focuses on how to collect data from publicly available sources over the web. It is in fact a common practice nowadays to integrat...
02 RESTful API
REST is a standardized, resource-based way of designing APIs. A RESTful API uses the available HTTP verbs to perform CRUD ("Create, Read, Update, Delete") operations based on the “context”: Collection: a set of items (e.g. /users). Item: a specific item in a col...
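To make the verb-plus-context idea concrete, here is a small in-memory sketch. It assumes the conventional mapping (POST on a collection creates, GET reads, PUT on an item replaces, DELETE on an item removes); the snippet above is truncated, so this mapping, the `UserResource` class and its `handle` method are illustrative assumptions rather than the page's own definitions. No real HTTP server is involved.

```python
class UserResource:
    """Toy dispatcher for the /users collection (hypothetical example)."""

    def __init__(self):
        self._users = {}
        self._next_id = 1

    def handle(self, method, path, body=None):
        """Return a (status_code, payload) pair for a verb and a path."""
        parts = [p for p in path.split("/") if p]
        if parts[:1] != ["users"] or len(parts) > 2:
            return 404, None

        if len(parts) == 1:  # context: the collection /users
            if method == "GET":  # Read all items
                return 200, list(self._users.values())
            if method == "POST":  # Create a new item
                user = {**(body or {}), "id": self._next_id}
                self._users[self._next_id] = user
                self._next_id += 1
                return 201, user
            return 405, None

        # context: a single item /users/<id>
        try:
            user_id = int(parts[1])
        except ValueError:
            return 404, None
        if user_id not in self._users:
            return 404, None
        if method == "GET":  # Read one item
            return 200, self._users[user_id]
        if method == "PUT":  # Update (replace) one item
            self._users[user_id] = {**(body or {}), "id": user_id}
            return 200, self._users[user_id]
        if method == "DELETE":  # Delete one item
            del self._users[user_id]
            return 204, None
        return 405, None
```

For example, `UserResource().handle("POST", "/users", {"name": "Ann"})` creates an item, while the same verb on `/users/1` is rejected with 405 in this sketch.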
03 Scraping
Web crawling, data crawling, and web scraping are all names for the process of data extraction. With this technique, data is extracted from various website pages and repositories. The data is then saved and stored for further use and analys...
01 NoSQL General Concepts
Differences from the traditional data model. Schema-less approach: these new types of data model require a lot of flexibility, which is usually achieved by getting rid of the traditional fixed schema of the relational model, using a so-called "schema-less" appro...
02 Transactional Properties in NoSQL
Transactions in SQL, ACID: In the relational world we are used to the concept of a transaction, an elementary unit of work delimited by begin and commit commands and characterized by the ACID properties: Atomicity: the operations contained in a transaction e...
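Atomicity can be observed with Python's built-in `sqlite3` module. The sketch below is illustrative only: the `accounts` table, the `CHECK` constraint and the helper names `open_bank`, `transfer` and `balances` are assumptions made for this example. A transfer that would overdraw an account fails on its second statement, and the first statement is rolled back with it.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two sample accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one unit of work."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised inside the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return the current balances as a dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))
```

The credit is deliberately executed before the debit so that, when the debit violates the constraint, there is already a pending change for the rollback to undo.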
03 Brief NoSQL history
MultiValue databases at TRW in 1965. DBM is released by AT&T in 1979. Lotus Domino is released in 1989. Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source relational database that did not expose the standard SQL interface. Graph databa...
04 A map of NoSQL technologies
Key-Value Store: a key that refers to a payload (actual content / data), e.g. MemcacheDB, Azure Table Storage, Redis. Key-value databases work by storing dictionaries or hash tables, which are collections of key-value pairs in which a key serves as a unique id...
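The access pattern described above can be sketched with a plain Python dictionary. This is a toy illustration, not the API of any of the products named; the `KeyValueStore` class and its methods are hypothetical.

```python
class KeyValueStore:
    """Minimal in-memory key-value store backed by a hash table."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        """Store the payload under key, overwriting any previous value."""
        self._data[key] = value

    def get(self, key, default=None):
        """Return the payload for key, or default if the key is absent."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove key; return True if it existed."""
        return self._data.pop(key, None) is not None
```

The store never inspects the payload: lookups go through the key only, which is what keeps the model simple.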
01 Graph Theory
Basic definitions: The graph is a data structure that was first used in 1736 to represent a city and its water canals, and it has been used ever since for a huge number of optimization problems that can be modeled as a set of entities connected by a relation ...
02 Graph Databases
Motivations: The table-based structure of relational databases makes it hard to represent relationships between rows in the same table, and whenever someone needs to find a relationship between records of different tables the database has to perform a JOIN op...
03 Neo4J
The most popular graph database is Neo4J, implemented in Java by Neo Technologies, and it has the following characteristics: open source; operational db; ACID guarantees; not efficient for large-scale graph analysis; nodes and edges are a native feature; Each node...
Introduction
Preliminaries: A search algorithm takes a search problem as input and returns a solution, or an indication of failure. We consider algorithms that superimpose a search tree over the state-space graph, forming various paths from the initial state, trying to find...
Breadth-first search
When all actions have the same cost, an appropriate strategy is breadth-first search, in which the root node is expanded first, then all the successors of the root node are expanded next, then their successors, and so on. This is a systematic search strategy t...
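A minimal sketch of this level-by-level expansion follows, assuming the state space is given as an adjacency dictionary (an assumption of this example, as are the function name and the sample graph). A FIFO queue holds the frontier, and the goal is tested as soon as a node is generated.

```python
from collections import deque


def breadth_first_search(graph, start, goal):
    """Return a path with the fewest actions from start to goal, or None."""
    if start == goal:
        return [start]
    frontier = deque([start])  # FIFO queue: shallowest nodes leave first
    parent = {start: None}     # also serves as the set of reached states
    while frontier:
        node = frontier.popleft()
        for child in graph.get(node, []):
            if child not in parent:
                parent[child] = node
                if child == goal:
                    # Follow parent links back to the root.
                    path = [child]
                    while parent[path[-1]] is not None:
                        path.append(parent[path[-1]])
                    return path[::-1]
                frontier.append(child)
    return None


# Illustrative graph only.
bfs_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}
```

Calling `breadth_first_search(bfs_graph, "A", "F")` expands A, then B and C, then D, at which point F is generated and the path is returned.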
Uniform-cost search
When actions have different costs, an obvious choice is to use best-first search where the evaluation function is the cost of the path from the root to the current node. The idea is that while breadth-first search spreads out in waves of uniform depth—first de...
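The idea of ordering the frontier by path cost can be sketched with a priority queue. The weighted adjacency dictionary, the function name and the sample graph below are assumptions made for illustration; `heapq` is Python's standard binary-heap module.

```python
import heapq


def uniform_cost_search(graph, start, goal):
    """Return (cost, path) of a cheapest path from start to goal, or None."""
    frontier = [(0, start, [start])]  # entries are (path cost, node, path)
    best_cost = {start: 0}
    while frontier:
        cost, node, path = heapq.heappop(frontier)
        if node == goal:
            # Goal test on expansion: the cheapest entry has been popped.
            return cost, path
        if cost > best_cost.get(node, float("inf")):
            continue  # stale entry; a cheaper route to node was found later
        for child, step_cost in graph.get(node, []):
            new_cost = cost + step_cost
            if new_cost < best_cost.get(child, float("inf")):
                best_cost[child] = new_cost
                heapq.heappush(frontier, (new_cost, child, path + [child]))
    return None


# Illustrative weighted graph only: node -> list of (successor, action cost).
ucs_graph = {
    "S": [("A", 1), ("B", 5)],
    "A": [("B", 1), ("G", 10)],
    "B": [("G", 2)],
    "G": [],
}
```

Here the direct-looking route S, A, G costs 11, while the search returns S, A, B, G at cost 4, because the goal is only accepted when it is popped from the queue rather than when it is first generated.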
Depth-first search
Depth-first search always expands the deepest node in the frontier first. It could be implemented as a call to BEST-FIRST-SEARCH where the evaluation function $f$ is the negative of the depth. However, it is usually implemented not as a graph search but as a t...
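One common way to write depth-first search is with an explicit LIFO stack of paths, as in the sketch below. The adjacency-dictionary representation, the function name and the sample graph are assumptions of this example; the check that a child is not already on the current path is an added safeguard against cycles, not something stated in the snippet above.

```python
def depth_first_search(graph, start, goal):
    """Return the first path to goal found by always expanding the deepest node."""
    stack = [[start]]  # LIFO stack of paths; the last one pushed is the deepest
    while stack:
        path = stack.pop()
        node = path[-1]
        if node == goal:
            return path
        # Push successors in reverse so the first-listed one is expanded first.
        for child in reversed(graph.get(node, [])):
            if child not in path:  # avoid looping along the current path
                stack.append(path + [child])
    return None


# Illustrative graph only.
dfs_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["G"],
    "D": ["G"],
    "G": [],
}
```

On this graph the search commits to the branch through B and returns a path of three actions, although a path of two actions through C exists, which illustrates that depth-first search is not cost-optimal.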
Depth-limited and iterative deepening search
Depth-limited: To keep depth-first search from wandering down an infinite path, we can use depth-limited search, a version of depth-first search in which we supply a depth limit, $l$, and treat all nodes at depth $l$ as if they had no successors. Evaluation an...
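A recursive sketch of depth-limited search follows, together with the iterative deepening wrapper named in the section title, which retries with limits 0, 1, 2, and so on. The adjacency-dictionary graph, the `CUTOFF` sentinel, the `max_depth` parameter and the function names are assumptions made for this example.

```python
CUTOFF = "cutoff"  # sentinel: the limit was hit, so deeper solutions may exist


def depth_limited_search(graph, node, goal, limit, path=None):
    """Return a path to goal, CUTOFF if the limit was reached, or None."""
    path = (path or []) + [node]
    if node == goal:
        return path
    if limit == 0:
        return CUTOFF  # treat this node as if it had no successors
    cutoff_occurred = False
    for child in graph.get(node, []):
        if child in path:  # skip cycles along the current path
            continue
        result = depth_limited_search(graph, child, goal, limit - 1, path)
        if result == CUTOFF:
            cutoff_occurred = True
        elif result is not None:
            return result
    return CUTOFF if cutoff_occurred else None


def iterative_deepening_search(graph, start, goal, max_depth=50):
    """Run depth-limited search with increasing limits up to max_depth."""
    for limit in range(max_depth + 1):
        result = depth_limited_search(graph, start, goal, limit)
        if result != CUTOFF:
            return result  # either a path or a definite failure (None)
    return CUTOFF


# Illustrative graph only.
dls_graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["G"],
    "D": ["G"],
    "G": [],
}
```

With a limit of 1 the search from A reports a cutoff, with a limit of 2 it finds the path through C, and iterative deepening finds that same path without the caller having to guess the limit.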