Search for {created_by:paolo-basso} {type:page}

01 The Data-driven Virtuous Cycle

SMBUD - Systems and Methods for Big and... 01-02 Big Data and data-driven decisions

Collect: widely different data sources: logs, IoT, social media, e-commerce, traditional business databases. Analyze: the analysis step is focused on description and prediction. It must be stakeholder-driven, where by stakeholders we mean: Customers / Clients, Sta...

02 What is Big Data and new ways to solve problems

SMBUD - Systems and Methods for Big and... 01-02 Big Data and data-driven decisions

Definition of the four Vs. Volume (data at scale): terabytes to exabytes of data accumulated on ever cheaper storage. Variety (data in many forms): structured, unstructured, text, images, video and multimedia in general. Velocity (data in motion): streaming...

03 Data-driven decisions

SMBUD - Systems and Methods for Big and... 01-02 Big Data and data-driven decisions

Using data, decisions no longer have to be made in the dark or based on gut instinct; they can be based on evidence, experiments and more accurate forecasts. Data-driven organizations perform better, are operationally more predictable and are more profitable. ...

01 ER

SMBUD - Systems and Methods for Big and... 03 ER and Relational Data Models

Foundations An entity–relationship model (or ER model) describes interrelated things of interest in a specific domain of knowledge. A basic ER model is composed of entity types (which classify the things of interest) and specifies relationships that can exist ...

02 Relational Model

SMBUD - Systems and Methods for Big and... 03 ER and Relational Data Models

Keys. Superkey: a set of attributes K is a superkey for a relation r if r does not contain two distinct tuples t1 and t2 with t1[K] = t2[K]. Key: K is a key for r if K is a minimal superkey (that is, there exists no other superkey K’ of r that is contained in K ...
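The two definitions above can be checked mechanically on a small relation instance. The following is a minimal sketch, assuming the relation is given as a list of Python dictionaries with distinct rows; the names `is_superkey`, `is_key` and the `STUDENTS` sample data are hypothetical and used only for illustration.

```python
from itertools import combinations

# Hypothetical relation instance: each dict is one tuple of the relation.
STUDENTS = [
    {"id": 1, "email": "a@x.org", "name": "Ann"},
    {"id": 2, "email": "b@x.org", "name": "Bob"},
    {"id": 3, "email": "c@x.org", "name": "Ann"},
]


def is_superkey(rows, attrs):
    """True if no two distinct rows agree on all attributes in attrs."""
    seen = set()
    for row in rows:
        projection = tuple(row[a] for a in attrs)
        if projection in seen:
            return False
        seen.add(projection)
    return True


def is_key(rows, attrs):
    """True if attrs is a superkey and no proper subset of it is one."""
    attrs = tuple(attrs)
    if not is_superkey(rows, attrs):
        return False
    # Proper subsets have sizes 0 .. len(attrs) - 1.
    return not any(
        is_superkey(rows, subset)
        for size in range(len(attrs))
        for subset in combinations(attrs, size)
    )
```

On the sample data, `["id", "name"]` is a superkey but not a key, because `["id"]` alone already identifies every row. Note that this only tests one instance of the relation; whether a set of attributes is a key of the schema is a design decision that no single instance can prove.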

01 Introduction to API

SMBUD - Systems and Methods for Big and... 04 Data ingestion and API

Data ingestion. Data ingestion is the first and fundamental step of any data analysis pipeline. The focus of this section is on how to collect data from publicly available sources on the web. It is in fact common practice nowadays to integrat...

02 RESTful API

SMBUD - Systems and Methods for Big and... 04 Data ingestion and API

REST is a standardized, resource-based way of designing APIs. A RESTful API uses the available HTTP verbs to perform CRUD ("Create, Read, Update, Delete") operations based on the “context”: Collection: a set of items (e.g. /users). Item: a specific item in a col...
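The verb-plus-context idea can be sketched as a small lookup. This is an illustration only, assuming the conventional REST mapping (GET reads, POST creates in a collection, PUT updates an item, DELETE removes an item); the function name `resolve_operation` and the operation labels are hypothetical and do not belong to any framework.

```python
# Conventional mapping from (HTTP verb, context) to a CRUD operation.
# The labels on the right are illustrative names, not a standard.
CRUD_BY_VERB_AND_CONTEXT = {
    ("GET", "collection"): "read all items",
    ("POST", "collection"): "create item",
    ("GET", "item"): "read item",
    ("PUT", "item"): "update item",
    ("DELETE", "item"): "delete item",
}


def resolve_operation(method, path):
    """Return (context, operation) for a request such as GET /users/42."""
    parts = [p for p in path.strip("/").split("/") if p]
    if len(parts) == 1:
        context = "collection"   # e.g. /users
    elif len(parts) == 2:
        context = "item"         # e.g. /users/42
    else:
        raise ValueError(f"unsupported path: {path!r}")
    try:
        return context, CRUD_BY_VERB_AND_CONTEXT[(method.upper(), context)]
    except KeyError:
        raise ValueError(f"{method} is not mapped for a {context}") from None
```

For example, `resolve_operation("GET", "/users")` yields the collection-level read, while `resolve_operation("DELETE", "/users/42")` yields the item-level delete. Real APIs often support further combinations (PATCH, nested resources), which this sketch deliberately leaves out.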

03 Scraping

SMBUD - Systems and Methods for Big and... 04 Data ingestion and API

Web crawling, data crawling, and web scraping are all names for the process of data extraction. With this technique, data is extracted from various website pages and repositories. The data is then saved and stored for further use and analys...
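The extraction step can be illustrated with Python's standard `html.parser` module. This is a minimal sketch that works on an HTML string already in memory; fetching the page, respecting a site's terms of use, and storing the result are left out. The class `LinkExtractor`, the function `extract_links` and the `SAMPLE_PAGE` markup are hypothetical.

```python
from html.parser import HTMLParser

# Hypothetical page content standing in for a downloaded document.
SAMPLE_PAGE = (
    '<ul><li><a href="/page/1">First <b>page</b></a></li>'
    '<li><a href="/page/2">Second page</a></li>'
    "<li><a>no link</a></li></ul>"
)


class LinkExtractor(HTMLParser):
    """Collect (href, link text) pairs from every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently open, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links
```

Calling `extract_links(SAMPLE_PAGE)` returns the two anchors that carry an `href` and skips the one that does not. Dedicated scraping libraries offer more robust parsing; the standard library is used here only to keep the example self-contained.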

01 NoSQL General Concepts

SMBUD - Systems and Methods for Big and... 05 NoSQL introduction

Differences with the traditional data model. Schema-less approach: these new types of data model require a lot of flexibility, which is usually implemented by getting rid of the traditional fixed schema of the relational model, using a so-called "schema-less" appro...
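What giving up a fixed schema means in practice can be sketched with plain Python dictionaries: records in the same collection need not share the same fields, and readers must tolerate missing ones. The `PRODUCTS` data and the helpers `field_names` and `find` are hypothetical and only illustrate the idea.

```python
# Hypothetical collection: each record carries only the fields it needs.
PRODUCTS = [
    {"sku": "B-1", "title": "Graph Theory", "pages": 320},
    {"sku": "T-7", "title": "T-shirt", "size": "M", "color": "blue"},
    {"sku": "E-3", "title": "E-book reader", "storage_gb": 32},
]


def field_names(records):
    """Union of all field names that appear in at least one record."""
    return sorted({name for record in records for name in record})


def find(records, **criteria):
    """Records whose fields match every criterion; absent fields never match."""
    missing = object()  # sentinel so that a stored None is not confused with absence
    return [
        record
        for record in records
        if all(record.get(k, missing) == v for k, v in criteria.items())
    ]
```

Here `field_names(PRODUCTS)` lists six distinct fields although no record has more than four, and `find(PRODUCTS, size="M")` simply ignores records without a `size` field. In a relational table the same data would need either many nullable columns or several tables.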

02 Transactional Properties in NoSQL

SMBUD - Systems and Methods for Big and... 05 NoSQL introduction

Transactions in SQL, ACID. In the relational world we are used to the concept of a transaction: an elementary unit of work, delimited by begin and commit commands, characterized by the ACID properties: Atomicity: the operations contained in a transaction e...
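Atomicity can be observed with Python's built-in `sqlite3` module: either every statement of a transaction takes effect or none does. The sketch below is only an illustration; the `accounts` table, the `transfer` function and the sample balances are hypothetical.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one transaction; undo everything on failure."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    ok = transfer(conn, "alice", "bob", 30)       # succeeds and is committed
    failed = transfer(conn, "alice", "bob", 500)  # fails and is rolled back
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    conn.close()
    return ok, failed, balances
```

In `demo`, the second transfer debits the source account before the check fails, yet the final balances show no trace of that debit because the whole unit of work was rolled back.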

03 Brief NoSQL history

SMBUD - Systems and Methods for Big and... 05 NoSQL introduction

MultiValue databases are developed at TRW in 1965. DBM is released by AT&T in 1979. Lotus Domino is released in 1989. Carlo Strozzi used the term NoSQL in 1998 to name his lightweight, open-source relational database that did not expose the standard SQL interface. Graph databa...

04 A map of NoSQL technologies

SMBUD - Systems and Methods for Big and... 05 NoSQL introduction

Key-Value Store: a key that refers to a payload (actual content / data), e.g. MemcacheDB, Azure Table Storage, Redis. Key-value databases work by storing dictionaries or hash tables, which are collections of key-value pairs in which a key serves as a unique id...
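The dictionary view of a key-value store can be sketched in a few lines. This is an in-memory toy, not the API of any of the products named above; the class `KeyValueStore` and its method names are hypothetical.

```python
class KeyValueStore:
    """Toy key-value store: an opaque payload addressed by a unique key."""

    def __init__(self):
        self._data = {}  # a Python dict is itself a hash table

    def put(self, key, payload):
        """Insert the payload, overwriting any previous value for the key."""
        self._data[key] = payload

    def get(self, key, default=None):
        """Return the payload for key, or default if the key is unknown."""
        return self._data.get(key, default)

    def delete(self, key):
        """Remove the key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


def demo():
    store = KeyValueStore()
    store.put("session:42", b'{"user": "ann"}')  # the store does not interpret the payload
    store.put("session:42", b'{"user": "bob"}')  # same key, so the value is replaced
    return store.get("session:42"), store.delete("session:42"), store.get("session:42")
```

The store only offers lookup by key: there is no way to query by the content of the payload, which is the main trade-off of this model compared with relational tables.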

01 Graph Theory

SMBUD - Systems and Methods for Big and... 06 Graph Stores

Basic definitions. The graph is a data structure that was first used in 1736 to represent a city and its bridges (Euler's Seven Bridges of Königsberg problem), and it has been used ever since for a huge number of optimization problems that can be modeled as a set of entities connected by a relation ...
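The 1736 problem makes a compact example of entities connected by a relation. The sketch below encodes the classic Königsberg layout as an edge list (four land masses, seven bridges) and applies Euler's parity condition for a connected graph: a walk that uses every edge exactly once exists only if zero or two vertices have odd degree. The vertex labels and function names are hypothetical.

```python
from collections import defaultdict

# Land masses A-D and the seven bridges between them (parallel edges allowed).
BRIDGES = [
    ("A", "B"), ("A", "B"),
    ("A", "C"), ("A", "C"),
    ("A", "D"), ("B", "D"), ("C", "D"),
]


def degrees(edges):
    """Number of edge endpoints touching each vertex."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return dict(degree)


def has_euler_walk(edges):
    """Parity test only; assumes the graph is connected."""
    odd = sum(1 for d in degrees(edges).values() if d % 2 == 1)
    return odd in (0, 2)
```

All four land masses have odd degree, so `has_euler_walk(BRIDGES)` is false: no route crosses every bridge exactly once.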

02 Graph Databases

SMBUD - Systems and Methods for Big and... 06 Graph Stores

Motivations. The table-based structure of relational databases makes it hard to represent relationships between rows of the same table; moreover, whenever someone needs to find a relationship between records of different tables, the database has to perform a JOIN op...
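The difference between joining and traversing can be sketched on a small "follows" relation. The first function mimics a self-join over a table of pairs, scanning the whole table for each match; the second follows adjacency lists directly from the starting node. The data in `FOLLOWS` and all function names are hypothetical, and the sketch ignores the indexes a real relational engine would use.

```python
from collections import defaultdict

# Hypothetical edge table: (follower, followed).
FOLLOWS = [("ann", "bob"), ("bob", "cat"), ("cat", "dan"), ("ann", "eve")]


def second_degree_by_join(edges, person):
    """Self-join of the edge table on edges1.followed = edges2.follower."""
    return sorted({
        b2
        for (a1, b1) in edges if a1 == person
        for (a2, b2) in edges if a2 == b1 and b2 != person
    })


def build_adjacency(edges):
    """Index built once: node -> list of directly connected nodes."""
    adjacency = defaultdict(list)
    for a, b in edges:
        adjacency[a].append(b)
    return adjacency


def second_degree_by_traversal(adjacency, person):
    """Follow two hops from person, touching only the nodes actually reached."""
    return sorted({
        second
        for first in adjacency.get(person, [])
        for second in adjacency.get(first, [])
        if second != person
    })
```

Both functions return the same answer, but the join rescans the edge table for every intermediate row, while the traversal cost depends only on the neighborhood of the starting node.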

03 Neo4J

SMBUD - Systems and Methods for Big and... 06 Graph Stores

The most popular graph database is Neo4J, implemented in Java by Neo Technology, and it has the following characteristics: open source; operational db; ACID guarantees; not efficient for large-scale graph analysis; nodes and edges are a native feature. Each node...