database schema

68 results


Catalyst 5.8: The Perl MVC Framework by Antano Solar John, Jonathan Rockway

business logic, c2.com, create, read, update, delete, database schema, Debian, en.wikipedia.org, Firefox, MVC pattern, Ruby on Rails, social intelligence, web application

Installing SQLite

The first step is to install SQLite from the SQLite website at http://sqlite.org/ (or through your distribution's package manager; the package is usually called "sqlite3"). You'll also need to install the Perl version of SQLite, called DBD::SQLite, and DBIx::Class itself, as well as an adapter that will link our database schema definition to the Catalyst application. You can install these, except for SQLite, using cpan with the following:

$ cpan -i DBD::SQLite DBIx::Class Catalyst::Model::DBIC::Schema

Creating a database schema

Once the dependencies are installed, we'll create a sample database with the sqlite3 command-line utility:

$ sqlite3 tmp/database
SQLite version 3.3.8
Enter ".help" for instructions
sqlite> CREATE TABLE test (id INTEGER PRIMARY KEY, subject TEXT, message TEXT, date DATETIME);
sqlite> INSERT INTO test (id, subject, message, date) VALUES(NULL, "test message", "this is a test message", "2005-01-01 00:00");
sqlite> INSERT INTO test (id, subject, message, date) VALUES(NULL, "another test message", "this is a another test message, hooray", "2005-01-01 00:01");
sqlite> SELECT * FROM test;
1|test message|this is a test message|2005-01-01 00:00
2|another test message|this is a another test message, hooray|2005-01-01 00:01
sqlite> .quit
$

Here we created a test table in a database file called tmp/database and added two rows to it.

The header, footer, messages, and stylesheets will all be handled automatically and will be easy to customize later.

Database design

The first real step will be to think about what kind of data we need to store and then design a database schema to efficiently store that data. To keep things simple (but realistic), let's set our specification as follows:

• The address book should keep track of multiple addresses for a person
• Each person can have a first and last name
• Each address can have a street address, a phone number, and an e-mail address

Translated into a relational database schema, that means we'll have a people table that will assign a unique identifier to each firstname and lastname pair. Then we'll have an addresses table that will allow each person to have multiple addresses, each consisting of a unique ID (so it's easy to reference later), a location ("Office", "Home"), a free-form postal address (so we don't have to worry about the complexity of having a city, country, state, postal code, and so on), a phone number, and an e-mail address.
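As a rough sketch, that design might look like the following (illustrative only; the book builds the schema with SQLite and DBIx::Class, and the exact column names here are assumptions):

import os
import sqlite3

# Illustrative sketch of the address book schema described above; column names
# are assumptions, not the book's exact definitions.
os.makedirs("tmp", exist_ok=True)
conn = sqlite3.connect("tmp/database")
conn.executescript("""
CREATE TABLE people (
    id        INTEGER PRIMARY KEY,
    firstname TEXT,
    lastname  TEXT
);
CREATE TABLE addresses (
    id       INTEGER PRIMARY KEY,
    person   INTEGER REFERENCES people(id),
    location TEXT,   -- e.g. "Office", "Home"
    postal   TEXT,   -- free-form postal address
    phone    TEXT,
    email    TEXT
);
""")
conn.commit()
conn.close()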

See CPAN config method 68 configuration management 192 Content-type: header 157 Controller 182 Controller class 8 count_users function 142 COUNT function 117 CPAN about 10 Catalyst, installing from 12 disadvantages 12 modules, installing 41 CRUD 55 D Data::Page object 82 database installing 198 database, accessing ways Object-relational mapper (ORM) 35 database, skeleton application database model, creating 37 [ 223 ] database model, using 37-39 database schema, installing 36 SQLite, installing 36 database model creating 37, 39, 141, 142, 143 using 37, 39 database schema creating 36 data putting on web, ChatStat application _controversial function 126 action, adding 136 actions, adding to display ups or downs on page 139 actions, modifying 136, 138 all_things method 132 custom ResultSets method, writing 124 data, putting on web 121-123 everything query 131 everything ResultSet method 131 having attribute 127 highest_rated, custom ResultSets method 124, 125 index controller method, changing 134, 135 issues 128, 130 least_controversial, custom ResultSets method 124, 126 list_things macro, reusing 136 lowest_rated, custom ResultSets method 124, 125 macros, creating 121 macros, using 123 main page 120 main page, writing 122 most_controversial, custom ResultSets method 124, 126 one_thing method 132 person() links 134 person, macros 121 ResultSet, adding 130 root/src/thing.tt2, template 133, 134 score, macros 121 template root/src/pair.tt2, creating 121, 122 thing, macros 121 uri_for_action, using for link 130 data types 186 dbh_do method 144, 145 DBI::Class 69 dbh_do method 144 DBIC 35 DBIC schema ($self) 113 DBIx::Class 15.


pages: 458 words: 46,761

Essential Sqlalchemy by Jason Myers, Rick Copeland

create, read, update, delete, database schema, microservices, Ruby on Rails, side project, SQL injection, web application

In addition to the generic types listed in Table 1-1, both SQL standard and vendor-specific types are available and are often used when a generic type will not operate as needed within the database schema due to its type or the specific type specified in an existing schema. A few good illustrations of this are the CHAR and NVARCHAR types, which benefit from using the proper SQL type instead of just the generic type. If we are working with a database schema that was defined prior to using SQLAlchemy, we would want to match types exactly. It’s important to keep in mind that SQL standard type behavior and availability can vary from database to database.
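For illustration, here is a hypothetical table definition that mixes SQL standard types with a generic type to match an existing schema (the table and column names are invented):

from sqlalchemy import CHAR, NVARCHAR, Column, Integer, MetaData, Table, Unicode

metadata = MetaData()

# Hypothetical table matching a pre-existing schema: CHAR(2) and NVARCHAR(100)
# mirror exact column types already defined in the database, while notes uses
# the generic Unicode type and lets SQLAlchemy pick the best match per backend.
customers = Table(
    "customers", metadata,
    Column("id", Integer, primary_key=True),
    Column("country_code", CHAR(2)),        # SQL standard fixed-width type
    Column("display_name", NVARCHAR(100)),  # SQL standard/vendor type
    Column("notes", Unicode(255)),          # generic type
)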

These modes can be used separately or together depending on your preference and the needs of your application. SQLAlchemy Core and the SQL Expression Language The SQL Expression Language is a Pythonic way of representing common SQL statements and expressions, and is only a mild abstraction from the typical SQL language. It is focused on the actual database schema; however, it is standardized in such a way that it provides a consistent language across a large number of backend databases. The SQL Expression Language also acts as the foundation for the SQLAlchemy ORM. ORM The SQLAlchemy ORM is similar to many other object relational mappers (ORMs) you may have encountered in other languages.

If we use hard references, such as cookies.c.cookie_id, in our ForeignKey definitions it will perform that resolution during module initialization and could fail depending on the order in which the tables are loaded. You can also define a ForeignKeyConstraint explicitly, which can be useful if trying to match an existing database schema so it can be used with SQLAlchemy. This works in the same way as before when we created keys, constraints, and indexes to match name schemes and so on. You will need to import the ForeignKeyConstraint from the sqlalchemy module prior to defining one in your table definition. The following code shows how to create the ForeignKeyConstraint for the order_id field between the line_items and orders table: ForeignKeyConstraint(['order_id'], ['orders.order_id']) Up until this point, we’ve been defining tables in such a way that SQLAlchemy can understand them.
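Put in context, the constraint shown above might sit in a table definition roughly like this (a sketch; the column list is trimmed and partly invented):

from sqlalchemy import Column, ForeignKeyConstraint, Integer, MetaData, Table

metadata = MetaData()

orders = Table(
    "orders", metadata,
    Column("order_id", Integer, primary_key=True),
)

# The constraint is declared at the table level and names the target column as
# the string "orders.order_id", so resolution happens when the metadata is
# used rather than at module-import time.
line_items = Table(
    "line_items", metadata,
    Column("line_item_id", Integer, primary_key=True),
    Column("order_id", Integer),
    Column("quantity", Integer),
    ForeignKeyConstraint(["order_id"], ["orders.order_id"]),
)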


pages: 357 words: 63,071

Essential SQLAlchemy by Rick Copeland

database schema, defense in depth, domain-specific language, SQL injection, value engineering, web application

Engines and MetaData This chapter introduces SQLAlchemy’s Engine and MetaData classes. The Engine class provides database connectivity, including a connection pool with various strategies for acquiring connections from the pool. The MetaData class maintains information about your database schema, including any tables and indexes defined. In this chapter, you will learn how to define a new database schema using MetaData as well as how to connect a MetaData instance to an existing schema. Engines and Connectables The SQLAlchemy-provided Engine class is responsible for managing the connection to the database. It does this by incorporating a database connection pool and a database-specific Dialect layer to translate the SQL expression language (Chapter 5) into database-specific SQL.
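A minimal sketch of the two classes working together (the connection URL and table are placeholders, written in current SQLAlchemy style rather than the book's exact code):

from sqlalchemy import Column, Integer, MetaData, Table, Unicode, create_engine

# Engine: manages the connection pool and the database-specific dialect.
engine = create_engine("sqlite:///example.db")

# MetaData: holds the schema definition (tables, indexes), independent of any
# particular connection.
metadata = MetaData()
users = Table(
    "users", metadata,
    Column("id", Integer, primary_key=True),
    Column("user_name", Unicode(16), unique=True, nullable=False),
)

# Emit CREATE TABLE statements for everything registered on this MetaData.
metadata.create_all(engine)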

For instance, to retrieve all the users created in 2007, you would write:

statement = user_table.select(and_(
    user_table.c.created >= date(2007,1,1),
    user_table.c.created < date(2008,1,1)))
result = statement.execute()

In order to use the SQL expression language, you need to provide SQLAlchemy with information about your database schema. For instance, if you are using the user table mentioned previously, your schema definition might be the following:

metadata=MetaData('sqlite://') # use an in-memory SQLite database
user_table = Table(
    'tf_user', metadata,
    Column('id', Integer, primary_key=True),
    Column('user_name', Unicode(16), unique=True, nullable=False),
    Column('email_address', Unicode(255), unique=True, nullable=False),
    Column('password', Unicode(40), nullable=False),
    Column('first_name', Unicode(255), default=''),
    Column('last_name', Unicode(255), default=''),
    Column('created', DateTime, default=datetime.now))

If you would rather use an existing database schema definition, you still need to tell SQLAlchemy which tables you have, but SQLAlchemy can reflect the tables using the database server’s introspection capabilities.

In this case, the schema definition reduces to the following:

users_table = Table('users', metadata, autoload=True)

Although the SQLAlchemy SQL expression language is quite powerful, it can still be tedious to manually specify the queries and updates necessary to work with your tables.


pages: 509 words: 92,141

The Pragmatic Programmer by Andrew Hunt, Dave Thomas

A Pattern Language, Broken windows theory, business logic, business process, buy low sell high, c2.com, combinatorial explosion, continuous integration, database schema, domain-specific language, don't repeat yourself, Donald Knuth, Ford Model T, Free Software Foundation, general-purpose programming language, George Santayana, Grace Hopper, higher-order functions, if you see hoof prints, think horses—not zebras, index card, Kaizen: continuous improvement, lateral thinking, loose coupling, Menlo Park, MVC pattern, off-by-one error, premature optimization, Ralph Waldo Emerson, revision control, Schrödinger's Cat, slashdot, sorting algorithm, speech recognition, systems thinking, the Cathedral and the Bazaar, traveling salesman, urban decay, Y2K

With Perl you can manipulate text, interact with programs, talk over networks, drive Web pages, perform arbitrary precision arithmetic, and write programs that look like Snoopy swearing.

Tip 28: Learn a Text Manipulation Language

To show the wide-ranging applicability of text manipulation languages, here's a sample of some applications we've developed over the last few years.

Database schema maintenance. A set of Perl scripts took a plain text file containing a database schema definition and from it generated:
– The SQL statements to create the database
– Flat data files to populate a data dictionary
– C code libraries to access the database
– Scripts to check database integrity
– Web pages containing schema descriptions and diagrams
– An XML version of the schema

Java property access.

Of course, this scheme works only if you make the code generation part of the build process itself.[9]

[9] Just how do you go about building code from a database schema? There are several ways. If the schema is held in a flat file (for example, as create table statements), then a relatively simple script can parse it and generate the source. Alternatively, if you use a tool to create the schema directly in the database, then you should be able to extract the information you need directly from the database's data dictionary. Perl provides libraries that give you access to most major databases.

Figure 3.3. Active code generator creates code from a database schema

Another example of melding environments using code generators happens when different programming languages are used in the same application.
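The flat-file approach mentioned in the footnote can be sketched in a few lines. Here is a hypothetical Python version (the book's scripts were Perl); the schema file format and type mapping are assumptions:

import re

# Hypothetical Python version of the flat-file approach: parse simple
# "CREATE TABLE name (col TYPE, ...);" statements and emit a bare data class
# per table. The type mapping is an assumption.
TYPE_MAP = {"INTEGER": "int", "TEXT": "str", "REAL": "float", "DATETIME": "str"}

def parse_schema(sql_text):
    tables = {}
    for name, body in re.findall(r"create table (\w+)\s*\((.*?)\);", sql_text, re.S | re.I):
        cols = []
        for col_def in body.split(","):
            col_name, col_type = col_def.split()[:2]
            cols.append((col_name, TYPE_MAP.get(col_type.upper(), "str")))
        tables[name] = cols
    return tables

def emit_class(name, cols):
    lines = [f"class {name.capitalize()}:", "    def __init__(self):"]
    lines += [f"        self.{col} = None  # {py_type}" for col, py_type in cols]
    return "\n".join(lines)

if __name__ == "__main__":
    schema = "create table contacts (first_name TEXT, last_name TEXT, phone_number TEXT);"
    for table, cols in parse_schema(schema).items():
        print(emit_class(table, cols))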

This is a clear violation of the DRY principle. To correct this problem, we need to choose the authoritative source of information. This may be the specification, it may be a database schema tool, or it may be some third source altogether. Let's choose the specification document as the source. It's now our model for this process. We then need to find a way to export the information it contains as different views—a database schema and a high-level language record, for example.[10]

[10] See It's Just a View, page 157, for more on models and views.

If your document is stored as plain text with markup commands (using HTML, LATEX, or troff, for example), then you can use tools such as Perl to extract the schema and reformat it automatically.


pages: 378 words: 67,804

Learning Android by Marko Gargenta

business logic, create, read, update, delete, database schema, Firefox, loose coupling, slashdot, SQL injection, web application

buttons, Programmatic User Interface, The StatusActivity Layout, Creating Your Application-Specific Object and Initialization Code, Update StatusActivity to Load the Menu, Add Menu Items, Toggle Service, Intent Service C C implementation, The APK, The Native Development Kit (NDK), Documentation and Standardized Headers, The JNI Header File, C Implementation Canvas, Custom Rose Widget classpaths, Adding the jtwitter.jar Library close() method, Refactoring Status Data coarse location permissions, The manifest file color, adding, Adding Color and Graphics, Adding Color Compass demo, Compass Demo, Custom Rose Widget compiling code, Compiling Code and Building Your Projects: Saving Files, Updating the Manifest File for Internet Permission comprehensive platform, Comprehensive, Yamba Project Overview content providers, Content Providers, Project Design, Creating a Content Provider, Getting the Data Type, Using Content Providers Through Widgets, Summary creating, Creating a Content Provider, Getting the Data Type overview, Content Providers, Project Design using through widgets, Using Content Providers Through Widgets, Summary CRUD principle, Content Providers cursors, Cursors, Creating the TimelineActivity Class, Querying Data custom permissions, Adding Custom Permissions to Send and Receive Broadcasts, Declaring Permissions in the Manifest File D .d() severity level, Logging in Android Dalvik, Dalvik databases, The Database, The Database Schema and Its Creation, Database Constraints, Creating the TimelineActivity Class constraints on, Database Constraints overview, The Database schema, creating, The Database Schema and Its Creation working with, Creating the TimelineActivity Class DbHelper, DbHelper DDMS, Setting Up Android Development Tools, LogCat from the Eclipse DDMS perspective, Exploring the Filesystem, The User Data Partition, Verify that the database was created debugging, Debugging Android Apps declarative user interface, Declarative User Interface delete() method, Four Major Operations, Creating a Content Provider design philosophy, Design Philosophy Destroyed State, Destroyed state development tools, Setting Up Android Development Tools distribution of applications, Application Distribution doInBackground() method, AsyncTask E .e() severity level, Logging in Android Eclipse Android Development Tools (ADT), Declarative User Interface Eclipse IDE, Installing Eclipse, Logging in Android, Using the WYSIWYG editor in Eclipse, Prefs Resource, PrefsActivity editing code in, Prefs Resource, PrefsActivity installing, Installing Eclipse Organize Imports tool, Logging in Android WYSIWYG Editor, Using the WYSIWYG editor in Eclipse emulators, The Emulator events, Other UI Events, Other UI Events execSQL() method, Four Major Operations, First Example F fetchStatusUpdates() method, Refactoring Status Data, Broadcasting Intents Fibonacci demo, An NDK Example: Fibonacci, Summary file system, Filesystem, The Filesystem Explained, Summary files, saving, Compiling Code and Building Your Projects: Saving Files fine location permissions, The manifest file format() method, Creating the TimelineActivity Class FrameLayout, FrameLayout G garbage collector, Creating the TimelineActivity Class Geocoder, Where Am I?

To use File Explorer in Eclipse, either open the DDMS perspective in the top-right corner of your Eclipse or go to Windows→Show View→Other…→Android→File Explorer. This will open the view of the filesystem of the device you are currently looking at. So far, you know that the database file is there, but don’t really know whether the database schema was created properly. The next section addresses that.

Using sqlite3

Android ships with the command-line tool sqlite3. This tool gives you access to the database itself. To see whether your database schema was created properly:

• Open up your terminal or command-line window.
• Type adb shell to connect to your running emulator or physical phone.
• Change the directory to the location of your database file by typing cd /data/data/com.marakana.yamba/databases/.

startActivity() method, Update StatusActivity to Handle Menu Events, Initial App Setup, Pending intents Starting State, Starting state startManagingCursor() method, Creating the TimelineActivity Class, Querying Data startService() method, Update the Options Menu Handling, Pending intents Status activity, Updating the Status Activity status data, Looping in the Service, Update UpdaterService, Refactoring Status Data, Refactoring Status Data, Defining the URI, Implementing the YambaWidget class status updates, The StatusActivity Layout, Looping in the Service, The Database, Broadcast Receivers, The TimelineReceiver, Updating the Services to Enforce Permissions, Using Content Providers Through Widgets, Summary, Intent Service, Sending Notifications, Summary checking for new, Looping in the Service using IntentService to run, Intent Service notification of, The TimelineReceiver, Sending Notifications, Summary permissions for, Updating the Services to Enforce Permissions screen, The StatusActivity Layout sending, Broadcast Receivers storing locally, The Database widget for displaying, Using Content Providers Through Widgets, Summary status.xml, The StatusActivity Layout StatusActivity, The StatusActivity Layout, The StatusActivity Java Class, AsyncTask, Update StatusActivity to Load the Menu, Simplifying StatusActivity Stopped State, Stopped state stopService() method, Update the Options Menu Handling, The Network Receiver strings resource, Important Widget Properties Stub() method, Implementing the Service, Binding to the Remote Service subscribers, About Broadcast Receivers system services, Services, Project Design, Services, UpdaterService, Update the Manifest File, Add Menu Items, Testing the Service, Looping in the Service, Testing the Service, System Services, Compass Demo, Custom Rose Widget, Common Steps in Using System Services, Location Service, Updating the Status Activity, Intent Service, Intent Service, Sending Notifications adding and handling menus, Add Menu Items Android Services vs. 
native services, Services common steps in using, Common Steps in Using System Services Compass demo, Compass Demo, Custom Rose Widget creating, UpdaterService defining in manifest file, Update the Manifest File intents, Intent Service, Intent Service location service, Location Service, Updating the Status Activity looping in, Looping in the Service notification service, Sending Notifications overview, Services, System Services and project design, Project Design testing, Testing the Service, Testing the Service T TableLayout, TableLayout TableRow, TableLayout testing, Testing the Service, Testing the Service, Testing the Service, Testing That It All Works text property, Important Widget Properties TextWatcher, Other UI Events Thread.sleep() method, Looping in the Service, Looping in the Service threading, Threading in Android timeline activity example, TimelineActivity, Creating an Adapter in TimelineActivity.java, Toggle Service, The TimelineReceiver timeline adapter example, TimelineAdapter, ViewBinder: A Better Alternative to TimelineAdapter timeline receivers, The TimelineReceiver, The TimelineReceiver, Updating the Services to Enforce Permissions toggle service, Toggle Service Twitter, A Real-World Example, Broadcast Receivers, The Yamba Application, Creating Your Application-Specific Object and Initialization Code, Updating the Manifest File for Internet Permission, AsyncTask, Other UI Events, Other UI Events, Other UI Events, Shared Preferences, UpdaterService, Pulling Data from Twitter, Pulling Data from Twitter, The Database Schema and Its Creation, Update UpdaterService, Lists and Adapters 140-character counter, Other UI Events, Other UI Events creating compatible apps, Creating Your Application-Specific Object and Initialization Code, Updating the Manifest File for Internet Permission, AsyncTask, Other UI Events, Shared Preferences, Lists and Adapters example of app, A Real-World Example, Broadcast Receivers pulling data from, UpdaterService, Pulling Data from Twitter, Pulling Data from Twitter, The Database Schema and Its Creation, Update UpdaterService and Yamba, The Yamba Application U UI (user interface), Part 1: Android User Interface, Two Ways to Create a User Interface, Creating Your Application-Specific Object and Initialization Code, Optimizing the User Interface Android objects, Creating Your Application-Specific Object and Initialization Code optimizing, Optimizing the User Interface two ways to create, Two Ways to Create a User Interface Uniform Resource Identifier (URI), Defining the URI unmarshaling, The Android Interface Definition Language update() method, Four Major Operations, Creating a Content Provider updateAppWidget() method, Implementing the YambaWidget class UpdaterService, UpdaterService, Pulling Data from Twitter, Update UpdaterService, Update UpdaterService, BootReceiver, The TimelineReceiver, Broadcasting Intents, Updating the Services to Enforce Permissions updateStatus() method, Single Thread URI (Uniform Resource Identifier), Defining the URI user data partition, Filesystem Partitions, The User Data Partition user interface, Two Ways to Create a User Interface (see UI) user preferences, Preferences (see preferences) username, The User Data Partition V ViewBinder, ViewBinder: A Better Alternative to TimelineAdapter views and layouts, Views and Layouts viruses, What about viruses, malware, spyware, and other bad things?


pages: 224 words: 48,804

The Productive Programmer by Neal Ford

anti-pattern, business process, c2.com, continuous integration, database schema, domain-specific language, don't repeat yourself, Firefox, general-purpose programming language, knowledge worker, Larry Wall, Paradox of Choice, Ruby on Rails, side project, type inference, web application, William of Occam

Using an O/R mapper introduces repetition in projects where we have essentially the same information in three places: in the database schema, in an XML mapping document, and in a class file. This represents two violations of the DRY principle. The solution to this problem is to create a single representation and generate the other two. The first step is deciding who is the “official” holder of this knowledge. For example, if the database is the canonical source, generate the XML mapping and the corresponding class files. In this example, I used Groovy (the scripting language dialect of Java) to solve the mismatch. On the project that spawned this example, the developers had no control over the database schema. Thus, I decided that the database is the canonical representation of the data.
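The same idea can be sketched in Python (the project described used Groovy and a specific O/R mapper; the table, mapping format, and output here are invented for illustration):

import sqlite3

# Hypothetical sketch of "database as canonical source": read column metadata
# from the live database and generate both an XML mapping stub and a class
# stub from it. The table and mapping format are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

def generate(table):
    # PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) rows.
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    mapping = "\n".join(
        f'  <property name="{name}" column="{name}" type="{ctype}"/>'
        for _, name, ctype, *_ in cols
    )
    fields = "\n".join(f"        self.{name} = None" for _, name, _, *_ in cols)
    xml = f'<class table="{table}">\n{mapping}\n</class>'
    cls = f"class {table.capitalize()}:\n    def __init__(self):\n{fields}"
    return xml, cls

xml_mapping, class_source = generate("person")
print(xml_mapping)
print(class_source)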

The Groovy script runs as part of the build process so that any time the database schema changes, it automatically generates the corresponding mapping file and the Java class to which it maps.

Migrations

Another situation where duplication creeps into projects also arises from the impedance mismatch between code and SQL. Lots of projects treat source code and SQL as completely separate artifacts, sometimes created by entirely separate groups of developers. Yet, for the source code to function correctly, it must rely on a certain version of the database schema and data. There are a couple of ways to solve this problem: one is framework specific, and the other was designed to work across frameworks and languages.

NOTE Never keep two copies of the same thing (like code and a diagram describing it). If you are going to draw your diagrams first, use the tool to generate the code. If you create informal diagrams using the whiteboard approach and need more formal diagrams later, generate them from code. Otherwise, they will always get out of sync.

Database Schemas

Just like class diagrams, database schemas are a dangerous area for needless repetition. SchemaSpy is an open source tool that does for database entity/relationship diagrams what yDoc does for code. It attaches to the database and produces table information (including metadata) and relationship diagrams like the one shown in Figure 5-6.


pages: 90 words: 17,297

Deploying OpenStack by Ken Pepple

Amazon Web Services, cloud computing, database schema, Infrastructure as a Service, Kickstarter, Ruby on Rails, web application, x509 certificate

Database

The database stores most of the configuration and run-time state for a cloud infrastructure. This includes the instance types that are available for use, instances in use, networks available, and projects. Table 4-4 details all the tables in the current Nova database schema.

Table 4-4. Nova Database Schema

Table Name       Description
migrate_version  Stores current version of the database schema as well as other migration-related info. Only used internally and by developers during upgrades.
migrations       Used for running host-to-host migration.
auth_tokens      Maps Authorization tokens (for all API transactions) to actual users (via the user id field).

The sync subcommand will upgrade the database schema for new versions of Nova, and the version subcommand will report the current version. Note: Nova uses a database abstraction library called SQLAlchemy to interact with its database. A complementary package called sqlalchemy-migrate is used to manage the database schema. Inspired by Ruby on Rails’ migrations feature, it provides a programmatic way to handle database schema changes. For Nova administrators, this only applies when they are upgrading versions. To upgrade schema versions, use nova-manage db sync. This should be rarely used unless you are installing from source or upgrading your installation.

First up is adding the network manager related options to /etc/nova.conf:

--network_manager=nova.network.manager.FlatDHCPManager
--fixed_range=10.0.0.0/8
--network_size=8

Now, let's add our network into the database with the nova-manage command:

# nova-manage network create 10.0.0.0/8 1 8

Finally, we will create the floating addresses using a range that will not overlap with the StackOps installation.

# nova-manage floating create cactus 192.168.1.136/29

With all the /etc/nova.conf changes made, we can create the database schema by running the nova-manage db commands. First, we will sync the database (create the schema), then we will version the database to make sure that the schema creation succeeded.

$ sudo nova-manage db sync
$ sudo nova-manage db version
14

And we are done with the configuration. All that is left to do now is restart the Nova daemons to pick up the new configuration options.


pages: 257 words: 64,973

Intrusion Detection With Snort, Apache, MySQL, PHP, and ACID by Rafeeq Ur Rehman

Chuck Templeton: OpenTable:, database schema, Free Software Foundation, information security, stealth mode startup, web application

The following command shows the structure of the iphdr table:

mysql> describe iphdr;
+----------+----------------------+------+-----+---------+-------+
| Field    | Type                 | Null | Key | Default | Extra |
+----------+----------------------+------+-----+---------+-------+
| sid      | int(10) unsigned     |      | PRI | 0       |       |
| cid      | int(10) unsigned     |      | PRI | 0       |       |
| ip_src   | int(10) unsigned     |      | MUL | 0       |       |
| ip_dst   | int(10) unsigned     |      | MUL | 0       |       |
| ip_ver   | tinyint(3) unsigned  | YES  |     | NULL    |       |
| ip_hlen  | tinyint(3) unsigned  | YES  |     | NULL    |       |
| ip_tos   | tinyint(3) unsigned  | YES  |     | NULL    |       |
| ip_len   | smallint(5) unsigned | YES  |     | NULL    |       |
| ip_id    | smallint(5) unsigned | YES  |     | NULL    |       |
| ip_flags | tinyint(3) unsigned  | YES  |     | NULL    |       |
| ip_off   | smallint(5) unsigned | YES  |     | NULL    |       |
| ip_ttl   | tinyint(3) unsigned  | YES  |     | NULL    |       |
| ip_proto | tinyint(3) unsigned  |      |     | 0       |       |
| ip_csum  | smallint(5) unsigned | YES  |     | NULL    |       |
+----------+----------------------+------+-----+---------+-------+
14 rows in set (0.00 sec)

mysql>

For people who want to go into the details of how data is stored, the database schema provides great information. You can view the complete database schema at http://www.incident.org/snortdb/.

5.1.5.1 Creating Extra Tables

When you are using some other programs with the database and Snort to map service numbers to service names, additional mapping information is needed. For example, TCP port 23 is used for Telnet.

Fortunately you can use the script create_mysql in the contrib directory and it will create all of the necessary tables for you. The contrib directory is present when you download Snort in the source code form from its web site http://www.snort.org and extract its source files. The create_mysql script is present along with other useful stuff in this directory. For example, scripts to create database schema in other types of database servers are also found in this directory. The following command uses this script to create all database tables in the snort database. [root@laptop]# mysql -h localhost -u rr -p snort < contrib/create_mysql Enter password: [root@laptop]# Different command line options are used with this command.

The opt table contains options. The reference and reference_system tables contain information about reference sites used to get more information about a vulnerability. This is the same information that is used inside Snort rules using the ref keyword as discussed in Chapter 3. The schema table shows the version of the database schema. The sensor table contains information about different sensors that are logging data to the Snort database. If there is only one Snort sensor, the table contains only one row. Similarly, the table contains one row for each sensor. The sig_class table contains information about different classes of Snort rules as discussed in Chapter 3.


pages: 408 words: 63,990

Build Awesome Command-Line Applications in Ruby: Control Your Computer, Simplify Your Life by David B. Copeland

business logic, Chris Wanstrath, Compatible Time-Sharing System, database schema, en.wikipedia.org, full stack developer, machine readable, Multics, Ruby on Rails, systems thinking, web application

(Recall that we must use bundle exec since we are running out of our source tree for this example; users of todo will, of course, be able to run todo on its own, as discussed in Chapter 9, Be Easy to Maintain.)

$ bundle exec bin/todo new
Reading new tasks from stdin...
Design database schema
Get access to production logs
Code Review
Implement model objects for new scheme
^D
$ bundle exec bin/todo done 3
$ bundle exec bin/todo list
1 - Design database schema
  Created: Sun Oct 30 12:53:11 -0400 2011
2 - Get access to production logs
  Created: Sun Oct 30 12:53:11 -0400 2011
3 - Code Review
  Created: Sun Oct 30 12:53:11 -0400 2011
  Completed: Sun Oct 30 13:00:05 -0400 2011
4 - Implement model objects for new schema
  Created: Sun Oct 30 12:53:11 -0400 2011

Let's enhance the output of list as follows: Show the task name in a brighter/bolder font (the ANSI escape sequences provide two versions of each color: a normal color and a brighter/bolder version).

Since the location of the "home directory" is really just shorthand for "whatever directory is in ENV[’HOME’]," we now have test coverage without worrying that our personal task list will be touched. Let's run our new test and make sure it passes. To prove that we aren't touching our home directory, we'll list our actual task list before and after running our feature.

$ todo list
1 - Design database schema
  Created: Sun Oct 02 08:06:12 -0500 2011
2 - Get access to production logs
  Created: Sun Oct 02 08:06:12 -0500 2011
3 - Code Review
  Created: Sun Oct 02 08:06:12 -0500 2011
$ rake features
Feature: We can add new tasks
  As a busy developer with a lot of things to do
  I want to keep a list of tasks I need to work on

  Scenario: The task list is in our home directory by default
    Given there is no task list in my home directory
    When I successfully run `todo new 'Some new todo item'`
    Then the task list should exist in my home directory
    When I successfully run `todo list`
    Then the stdout should contain "Some new todo item"

1 scenarios (1 passed)
5 steps (5 passed)
0m0.793s
$ todo list
1 - Design database schema
  Created: Sun Oct 02 08:06:12 -0500 2011
2 - Get access to production logs
  Created: Sun Oct 02 08:06:12 -0500 2011
3 - Code Review
  Created: Sun Oct 02 08:06:12 -0500 2011

Our test passed, but our task list wasn’t modified—everything worked!

Manipulating the environment is a great technique for testing behavior like this; the environment works as a "middleman" that allows us to change things (like the location of the user's home directory) without affecting the code of our tests or our app.


pages: 757 words: 193,541

The Practice of Cloud System Administration: DevOps and SRE Practices for Web Services, Volume 2 by Thomas A. Limoncelli, Strata R. Chalup, Christina J. Hogan

active measures, Amazon Web Services, anti-pattern, barriers to entry, business process, cloud computing, commoditize, continuous integration, correlation coefficient, database schema, Debian, defense in depth, delayed gratification, DevOps, domain-specific language, en.wikipedia.org, fault tolerance, finite state, Firefox, functional programming, Google Glasses, information asymmetry, Infrastructure as a Service, intermodal, Internet of things, job automation, job satisfaction, Ken Thompson, Kickstarter, level 1 cache, load shedding, longitudinal study, loose coupling, machine readable, Malcom McLean invented shipping containers, Marc Andreessen, place-making, platform as a service, premature optimization, recommendation engine, revision control, risk tolerance, Salesforce, scientific management, seminal paper, side project, Silicon Valley, software as a service, sorting algorithm, standardized shipping container, statistical model, Steven Levy, supply-chain management, systems thinking, The future is already here, Toyota Production System, vertical integration, web application, Yogi Berra

11.8 Live Schema Changes

Sometimes a new software release expects a different database schema from the previous release. If the service could withstand downtime, one would bring down the service, upgrade the software and change the database schema, and then restart the service. However, downtime is almost always unacceptable in a web environment. In a typical web environment there are many web server frontends, or replicas, that all talk to the same database server. The older software release is unable to understand the new database schema and will malfunction or crash. The newer software release is unable to understand the old database schema and will also malfunction.

Therefore you cannot change the database schema and then do the software upgrade: the older replicas will fail as soon as the database is modified. There will be no service until replicas are upgraded. You cannot upgrade the replicas and then change the database schema because any upgraded replica will fail. The schema cannot be upgraded during the rolling upgrade: new replicas will fail, and then at the moment the database schema changes, all the failing replicas will start to work and the working replicas will start to fail. What chaos! Plus, none of these methods has a decent way to roll back changes if there is a problem.

Flag flips are a technique whereby new features are included in a release but are disabled. A flag or software toggle enables the feature at a designated time. This makes it easy to revert a change if problems are found: simply turn the toggle off. Database schema changes on live systems can be difficult to coordinate. It is not possible to change the software of all clients at the exact same time as the database schema. Instead, the McHenry Technique is used to decouple the changes. New fields are added, software is upgraded to use the old and new fields, software is upgraded to use only the new fields, and any old fields are then removed from the schema.
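A sketch of that expand/contract sequence, using plain SQLite DDL for illustration (the exact phases, timing, and rollback handling depend on the deployment process):

import sqlite3

# Illustrative expand/contract sequence for splitting a "fullname" column,
# following the phased approach described above. A real change would be
# spread over several separate deployments.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT)")

# Phase 1: add the new fields; the old software release keeps using fullname.
conn.execute("ALTER TABLE users ADD COLUMN first_name TEXT")
conn.execute("ALTER TABLE users ADD COLUMN last_name TEXT")

# Phase 2: deploy software that writes both old and new fields (and backfill).
# Phase 3: deploy software that reads and writes only the new fields.

# Phase 4: once no running release depends on it, drop the old field.
# (SQLite only supports DROP COLUMN from version 3.35 onward.)
conn.execute("ALTER TABLE users DROP COLUMN fullname")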


pages: 360 words: 96,275

PostgreSQL 9 Admin Cookbook: Over 80 Recipes to Help You Run an Efficient PostgreSQL 9.0 Database by Simon Riggs, Hannu Krosing

business intelligence, business process, database schema, Debian, en.wikipedia.org, full text search, GnuPG, MITM: man-in-the-middle, Skype

After all of these have executed, the databases are fully replicated with slon daemons running.

• 01_create_init.sh creates the bash scripts for later use. You can edit $SLON_TMP/my_replica.sh and add/remove tables, if needed. So selective replication is a simple matter of editing those files.
• create_struct.sh performs a dump from Master and a restore to the slave of the database schema. This dumps the whole database schema.
• 02_exec_init.sh executes the scripts. This copies the data, so may take a while.

Maintaining replication

Slony doesn't replicate the following items:
• Table definition changes (DDL commands)
• Changes to users and roles
• Changes to large objects (BLOBS)
• Changes made by the TRUNCATE command (may change in future)

So, once you've set up Slony, you'll need to make any further changes to the replicated tables onto each of the nodes in the configuration.

A recommended minimal log_line_prefix format string for auditing DDL is '%t %u %d', which tells PostgreSQL to log the timestamp, database user, and database name at the start of every log line.

Can't I find out this information from the database?

If you don't have logging enabled, or don't have all the logs, then you can get only very limited information on who changed the database schema and when from system tables, and even that is not reliable. What you can get is the "owner" of the database object (table, sequence, function, and so on), but this may have been changed by "ALTER TABLE … SET OWNER to yyyy". You may be able to guess the approximate time of object creation or of the latest modification by looking up the transaction id in the xmin system column in the pg_class and pg_attribute system tables.

HOT updates not only avoid creating new index entries but also can perform a fast mini-vacuum inside the page to make room for new rows.

ALTER TABLE t1 SET (fillfactor = 70);

This tells PostgreSQL to fill only 70% of each page in table t1 when doing inserts, so that 30% is left for use by in-page (HOT) updates.

Rewriting the schema—a more radical approach

On some occasions, it may make sense to rewrite the database schema and provide an old view for unchanged queries using views, triggers, rules, and functions. One such case occurs when refactoring the database, when you want old queries to keep running while changes are made. Another is an external application that is unusable with the provided schema, but can be made to perform OK with a different distribution of data between tables.

Why is my query not using an index?


pages: 355 words: 81,788

Monolith to Microservices: Evolutionary Patterns to Transform Your Monolith by Sam Newman

Airbnb, business logic, business process, continuous integration, Conway's law, database schema, DevOps, fail fast, fault tolerance, ghettoisation, inventory management, Jeff Bezos, Kubernetes, loose coupling, microservices, MVC pattern, price anchoring, pull request, single page application, single source of truth, software as a service, source of truth, sunk-cost fallacy, systems thinking, telepresence, two-pizza team, work culture

This is clearly not a great solution to a problem we shouldn’t have had in the first place, but it worked—mostly. However, we soon realized that most of these applications weren’t undergoing active maintenance, meaning there was no chance that they would be updated to reflect a new schema design.2 In effect, our database schema had become a public-facing contract that couldn’t change: we had to maintain that schema structure going forward.

Views to Present

Our solution was to first resolve those situations where external systems were writing into our schema. Luckily, in our case they were easy to resolve. For all those clients who wanted to read data, though, we created a dedicated schema hosting views that looked like the old schema, and had clients point at that schema instead, as Figure 4-3 shows.
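A sketch of the idea, issuing plain DDL through SQLAlchemy (the schema and column names are invented; PostgreSQL syntax is assumed):

from sqlalchemy import create_engine, text

# Sketch: expose the old table shape as a view in a dedicated schema so
# external readers keep working while the underlying tables change.
engine = create_engine("postgresql://localhost/app")

with engine.begin() as conn:
    conn.execute(text("CREATE SCHEMA IF NOT EXISTS compat"))
    conn.execute(text("""
        CREATE OR REPLACE VIEW compat.customers AS
        SELECT id,
               first_name || ' ' || last_name AS name,  -- the old single column
               email
        FROM internal.customers
    """))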

Well, fundamentally, logical and physical decomposition achieve different goals. Logical decomposition allows for simpler independent change and information hiding, whereas physical decomposition potentially improves system robustness, and could help remove resource contention allowing for improved throughput or latency. When we logically decompose our database schemas but keep them on the same physical database engine, as in Figure 4-24, we have a potential single point of failure. If the database engine goes down, both services are affected. However, the world isn’t that simple. Many database engines have mechanisms to avoid single points of failure, such as multiprimary database modes, warm failover mechanisms, and the like.

In the end, they settled on a model in which the new revenue functionality was effectively deployed as a single service, containing three isolated bounded contexts (each of which ended up as separate JAR files), as shown in Figure 4-28. Figure 4-28. Each bounded context in the Revenue service had its own separate database schema, allowing separation later on Each bounded context had its own, totally separate databases. The idea was that if there was a need to separate them into microservices later, this would be much easier. However, it turned out that this was never needed. Several years later, this revenue service remains as it is, a monolith with multiple associated databases—a great example of a modular monolith.


Programming Android by Zigurd Mednieks, Laird Dornin, G. Blake Meike, Masumi Nakamura

anti-pattern, business process, conceptual framework, create, read, update, delete, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, general purpose technology, Google Earth, interchangeable parts, iterative process, loose coupling, MVC pattern, revision control, RFID, SQL injection, systems thinking, web application

Note: The tables created by SQL CREATE TABLE statements and the attributes they contain are called a database schema.

DROP TABLE

This removes a table added with the CREATE TABLE statement. It takes the name of the table to be deleted. On completion, any data that was stored in the table may not be retrieved. Here is some SQL code that will create and then delete a simple table for storing contacts:

CREATE TABLE contacts (
    first_name TEXT,
    last_name TEXT,
    phone_number TEXT,
    height_in_meters REAL);

DROP TABLE contacts;

When entering commands through sqlite3, you must terminate each command with a semicolon. You may change the database schema after you create tables (which you may want to do to add a column or change the default value of a column) by entering the ALTER TABLE command.

Detailed information on Monkey can be found at http://developer.android.com/guide/developing/tools/monkey.html.

sqlite3

Android uses SQLite as the database system for many system databases and provides APIs for applications to make use of SQLite, which is convenient for data storage and presentation. SQLite also has a command-line interface, and the sqlite3 command enables developers to dump database schemas and perform other operations on Android databases. These databases are, of course, in an Android device, or they are contained in an AVD, and therefore the sqlite3 command is available in the adb shell. Detailed directions for how to access the sqlite3 command line from inside the adb shell are available at http://developer.android.com/guide/developing/tools/adb.html#shellcommands.

This may help you to avoid orphaned references, also known as enforcement of foreign keys. However, the foreign key constraint in SQLite is optional, and is turned off in Android. As of Android 2.2, you cannot rely on a foreign key constraint to catch incorrect foreign key references, so you will need to take care when creating database schemas that use foreign keys. There are several other constraints with less far-reaching effects: UNIQUE Forces the value of the given column to be different from the values in that column in all existing rows, whenever a row is inserted or updated. Any insert or update operation that attempts to insert a duplicate value will result in an SQLite constraint violation.


Refactoring: Improving the Design of Existing Code by Martin Fowler, Kent Beck, John Brant, William Opdyke, Don Roberts

business logic, conceptual framework, database schema, index card, MVC pattern, no silver bullet, place-making, sorting algorithm

As we learn more about refactoring, we will come up with more solutions to problems and learn about what problems are difficult to solve.

Databases

One problem area for refactoring is databases. Most business applications are tightly coupled to the database schema that supports them. That's one reason that the database is difficult to change. Another reason is data migration. Even if you have carefully layered your system to minimize the dependencies between the database schema and the object model, changing the database schema forces you to migrate the data, which can be a long and fraught task. With nonobject databases a way to deal with this problem is to place a separate layer of software between your object model and your database model.

Whereas applying (and possibly backing out) a series of incremental refactorings is fairly natural for Smalltalk and CLOS, the cost per iteration (in terms of recompilation and retesting) is higher for C++ programs; thus programmers tend to be less willing to make these small changes. Many applications use a database. Changes to the structure of objects in a C++ program may require that corresponding schematic changes be made to the database. (Many of the ideas I applied in my refactoring work came from research into object-oriented database schema evolution.) Another limitation, which may interest software researchers more than many software practitioners, is that C++ does not provide support for metalevel program analysis and change. There is nothing analogous to the metaobject protocol available for CLOS. The metaobject protocol of CLOS, for example, supports a sometimes useful refactoring for changing selected instances of a class into instances of a different class while having all references to the old objects automatically point to the new objects.

There are many other checks, most as simple as this, some more complex. In my research, I defined safety in terms of program properties (related to activities such as scoping and typing) that need to continue to hold after refactoring. Many of these program properties are similar to integrity constraints that must be maintained when database schemas change.[17] Each refactoring has associated with it a set of necessary preconditions that if true would ensure that the program properties are preserved. Only if the tool were to determine that everything is safe would the tool perform the refactoring. Fortunately, determining whether a refactoring is safe often is trivial, especially for the low-level refactorings that constitute most of our refactoring.


Beautiful Visualization by Julie Steele

barriers to entry, correlation does not imply causation, data acquisition, data science, database schema, Drosophila, en.wikipedia.org, epigenetics, global pandemic, Hans Rosling, index card, information retrieval, iterative process, linked data, Mercator projection, meta-analysis, natural language processing, Netflix Prize, no-fly zone, pattern recognition, peer-to-peer, performance metric, power law, QR code, recommendation engine, semantic web, social bookmarking, social distancing, social graph, sorting algorithm, Steve Jobs, the long tail, web application, wikimedia commons, Yochai Benkler

So, I wrote some scripts to extract only the part of the data I needed for my visualization and stored it in a SQLite database. The schema is shown in Figure 8-1. In the interests of simplicity, I assigned a party based on only the most recent <role>, a decision that would come back to haunt me.

Figure 8-1. Simple database schema for representing the raw data required for the visualization

Computing the Voting Affinity Matrix

With the raw data munged into a more pliant format, I was ready to tackle the problem of computing the affinities that would represent the edges in the graph. This entailed building an affinity matrix (Figure 8-2) that tallied the number of times different senators voted the same way on the same bills.
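The tallying itself is straightforward; here is a hypothetical Python sketch over (senator, bill, vote) rows pulled from such a database (the rows shown are made up):

from collections import defaultdict
from itertools import combinations

# Hypothetical sketch of building the voting-affinity matrix: count, for each
# pair of senators, how many times they cast the same vote on the same bill.
rows = [
    ("Senator A", "Bill 1", "yea"),
    ("Senator B", "Bill 1", "yea"),
    ("Senator C", "Bill 1", "nay"),
    ("Senator A", "Bill 2", "nay"),
    ("Senator C", "Bill 2", "nay"),
]

votes_by_bill = defaultdict(dict)   # bill -> {senator: vote}
for senator, bill, vote in rows:
    votes_by_bill[bill][senator] = vote

affinity = defaultdict(int)         # (senator, senator) -> shared-vote count
for votes in votes_by_bill.values():
    for (s1, v1), (s2, v2) in combinations(sorted(votes.items()), 2):
        if v1 == v2:
            affinity[(s1, s2)] += 1

print(dict(affinity))
# {('Senator A', 'Senator B'): 1, ('Senator A', 'Senator C'): 1}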

Most databases have a special cube or rollup keyword to create an aggregation from a regular table. This has the advantage that no special processing is needed beforehand, but the disadvantage of being slower and requiring more disk space to store all the original values. Structuring the data specifically for fast read and aggregation performance (as is done in data warehouses and our database schema) considerably speeds up the most common operation at the expense of more processing being required when new data is added. While the ParSets program does not currently show numerical dimensions, it does store them in the database. They are stored in a separate table, containing the key of the row the values correspond to and one column per numerical dimension.

For instance, it was assumed that we might want to split winners by country (and compare this data to the overall submission statistics) and look at the relationships between authors and categories.

Figure 13-1. Matrix of initial interest in attribute combinations

Understanding the Data Situation

Next, I began to look into the available data, together with Sandor Herramhof. Over the years, a number of database schemas with different conventions and varying degrees of modeling detail had been used, which made it very difficult to get an early overview of the existing data. For instance, one database featured additional information stored in an XML format inside a text field, but only for some of the submissions.


pages: 313 words: 75,583

Ansible for DevOps: Server and Configuration Management for Humans by Jeff Geerling

Abraham Maslow, AGPL, Amazon Web Services, cloud computing, continuous integration, database schema, Debian, defense in depth, DevOps, fault tolerance, Firefox, full text search, Google Chrome, inventory management, loose coupling, microservices, Minecraft, MITM: man-in-the-middle, punch-card reader, Ruby on Rails, web application

Add a line to the main.yml file to include a new deploy.yml playbook:

1 ---
2 - include: provision.yml
3 - include: deploy.yml

Now we’re ready to create the deploy.yml playbook, which will do the following:

• Use git to check out the latest production release of the Rails app.
• Copy over a secrets.yml template that holds some secure app data required for running the app.
• Make sure all the gems required for the app are installed (via Bundler).
• Create the database (if it doesn’t already exist).
• Run rake tasks to make sure the database schema is up-to-date and all assets (like JS and CSS) are compiled.
• Make sure the app files’ ownership is set correctly so Passenger and Nginx can serve them without error.
• If any changes or updates were made, restart Passenger and Nginx.

Most of these tasks will use Ansible’s modules, but for a few, we’ll just wrap the normal deployment-related commands in shell since there aren’t pre-existing modules to take care of them for us:

 1 ---
 2 - hosts: all
 3   sudo: yes
 4
 5   vars_files:
 6     - vars.yml
 7
 8   roles:
 9     - geerlingguy.passenger
10
11   tasks:
12     - name: Ensure demo application is at correct release.
13       git:
14         repo: https://github.com/geerlingguy/demo-rails-app.git
15         version: "{{ app_version }}"
16         dest: "{{ app_directory }}"
17         accept_hostkey: true
18       register: app_updated
19       notify: restart nginx
20
21     - name: Ensure secrets file is present.
22       template:
23         src: templates/secrets.yml.j2
24         dest: "{{ app_directory }}/config/secrets.yml"
25         owner: "{{ app_user }}"
26         group: "{{ app_user }}"
27         mode: 0664
28       notify: restart nginx
29
30     - name: Install required dependencies with bundler.
31       shell: "bundle install --path vendor/bundle chdir={{ app_directory }}"
32       when: app_updated.changed == true
33       notify: restart nginx
34
35     - name: Check if database exists.
36       stat: "path={{ app_directory }}/db/{{ app_environment.RAILS_ENV }}.sqlite3"
37       register: app_db_exists
38
39     - name: Create database.
40       shell: "bundle exec rake db:create chdir={{ app_directory }}"
41       when: app_db_exists.stat.exists == false
42       notify: restart nginx
43
44     - name: Perform deployment-related rake tasks.
45       shell: "{{ item }} chdir={{ app_directory }}"
46       with_items:
47         - bundle exec rake db:migrate
48         - bundle exec rake assets:precompile
49       environment: app_environment
50       when: app_updated.changed == true
51       notify: restart nginx
52
53     - name: Ensure demo application has correct user for files.
54       file:
55         path: "{{ app_directory }}"
56         state: directory
57         owner: "{{ app_user }}"
58         group: "{{ app_user }}"
59         recurse: yes
60       notify: restart nginx

The first thing you’ll notice (besides the fact that we’ve included the vars.yml file again, since we need those variables in this playbook as well) is that we’ve added the geerlingguy.passenger role in this playbook.

Since this application uses a simple SQLite database, it’s a matter of checking if the .sqlite3 file exists, and if not, running the db:create task. (Lines 44-51) If the app_updated variable shows that a change occurred as part of the first git task, we’ll also run a couple rake tasks to make sure the database schema is up to date, and all assets (like scripts and stylesheets) are compiled. (Lines 53-60) Make sure all app files have the correct permissions for Passenger/Nginx to serve them correctly. Because many of the tasks result in filesystem changes that could change the behavior of the application, they all notify the restart nginx handler provided by the geerlingguy.passenger role, so Passenger reloads the configuration and restarts the app.

This way the entire state of your infrastructure is encapsulated in your playbook files, which ideally should be version controlled and managed similarly to the application they deploy. Plus, who wants to enter any more information on the command line than is absolutely required? Our application is a fairly generic web application that has updates to application code (which require a webserver reload), styles (which need recompiling), and possibly the database schema (which needs rake migrate tasks to be run). Any time app_version is changed inside playbooks/vars.yml, the deploy playbook will automatically run all the required tasks to get our app running with the latest code. Update app_version to 1.3.0, and then run the following command again: $ ansible-playbook deploy.yml -i inventory-ansible After a minute or so, the deployment should complete, and once that’s done, you’ll see the much improved new version of the Demonstration Ruby on Rails Application: Rails app - version 1.3.0 with a responsive UI.


PostgreSQL: Up and Running, 3rd Edition by Unknown

cloud computing, database schema, full text search, job automation, platform as a service, profit maximization, Salesforce, SQL injection, web application

Figure 1-2. phpPgAdmin

Adminer

If you manage other databases besides PostgreSQL and are looking for a unified tool, Adminer might fit the bill. Adminer is a lightweight, open source PHP application with options for PostgreSQL, MySQL, SQLite, SQL Server, and Oracle, all delivered through a single interface. One unique feature of Adminer we're impressed with is the relational diagrammer that can produce a schematic layout of your database schema, along with a linear representation of foreign key relationships. Another hassle-reducing feature is that you can deploy Adminer as a single PHP file. Figure 1-3 is a screenshot of the login screen and a snippet from the diagrammer output. Many users stumble at the login screen of Adminer because it doesn't include a separate text box for indicating the port number.

This book is not even going to attempt to describe all that you'll find in a standard PostgreSQL install. With PostgreSQL churning out features at breakneck speed, we can't imagine any book that could possibly do this. We limit our quick overview to those objects that you should be familiar with.

Databases: Each PostgreSQL service houses many individual databases.

Schemas: Schemas are part of the ANSI SQL standard. They are the immediate next level of organization within each database. If you think of the database as a country, schemas would be the individual states (or provinces, prefectures, or departments, depending on the country). Most database objects first belong to a schema, which belongs to a database.
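To make the hierarchy concrete, here is a minimal sketch (not from the book) that lists the schemas inside a single database using psycopg2; the connection parameters are placeholders for your own setup.

# List the schemas contained in one database with psycopg2.
import psycopg2

conn = psycopg2.connect(dbname="postgres", user="postgres",
                        password="secret", host="localhost")
with conn, conn.cursor() as cur:
    # Each database has its own set of schemas; most objects
    # (tables, views, functions) live inside one of them.
    cur.execute(
        "SELECT schema_name FROM information_schema.schemata ORDER BY schema_name;"
    )
    for (schema_name,) in cur.fetchall():
        print(schema_name)
conn.close()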

It can even export results as HTML, providing you with a turn-key reporting engine, albeit a bit crude.

Backup and restore wizard: Can't remember the myriad of commands and switches to perform a backup or restore using pg_restore and pg_dump? pgAdmin has a nice interface that lets you selectively back up and restore databases, schemas, single tables, and globals. You can view and copy the underlying pg_dump or pg_restore command that pgAdmin used in the Message tab.

Grant wizard: This time-saver allows you to change privileges on many database objects in one fell swoop.

pgScript engine: This is a quick-and-dirty way to run scripts that don't have to complete as a transaction.
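The same selective backup is scriptable outside pgAdmin. The rough sketch below (not from the book) wraps pg_dump and pg_restore with Python's subprocess; the database, schema, and file names are placeholders.

# Back up a single schema to a custom-format archive, and restore it.
import subprocess

def dump_schema(dbname, schema, outfile):
    # -Fc produces a custom-format archive that pg_restore understands;
    # -n restricts the dump to one schema (use -t for a single table).
    subprocess.run(
        ["pg_dump", "-Fc", "-n", schema, "-f", outfile, dbname],
        check=True,
    )

def restore_archive(dbname, dumpfile):
    # Restore the archive into an existing database.
    subprocess.run(["pg_restore", "-d", dbname, dumpfile], check=True)

if __name__ == "__main__":
    dump_schema("mydb", "sales", "sales.dump")
    # restore_archive("mydb_copy", "sales.dump")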


pages: 283 words: 78,705

Principles of Web API Design: Delivering Value with APIs and Microservices by James Higginbotham

Amazon Web Services, anti-pattern, business intelligence, business logic, business process, Clayton Christensen, cognitive dissonance, cognitive load, collaborative editing, continuous integration, create, read, update, delete, database schema, DevOps, fallacies of distributed computing, fault tolerance, index card, Internet of things, inventory management, Kubernetes, linked data, loose coupling, machine readable, Metcalfe’s law, microservices, recommendation engine, semantic web, side project, single page application, Snapchat, software as a service, SQL injection, web application, WebSocket

A Word of Caution on Resource Identification

While it may be tempting to use a database schema as the starting point for resource identification, keep in mind that API designs should not leak internal implementation details. A database schema reflects optimizations for transactional read and write operations rather than exposing business domain concepts over a network API. It is best to work top-down when modeling an API to avoid leaking internal data model decisions into the API design. If the implementation phase demonstrates a one-to-one relationship between resources and a database schema, then it becomes a "happy accident", as painter Bob Ross would say.

Relational databases are still heavily used, but additional choices including document and columnar databases have gained recent popularity. While APIs may have an impact on the data model, an API design should evolve independently from the latest database trends. What happens when teams expose a data model as an API? Constant code changes: Database schema changes will result in a constantly changing API, as the API must keep in lockstep with the underlying database. This change to the data model will force consumers into a complex conformist relationship in which they must rewrite their API integration code every time the underlying data model changes.
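One common way to avoid that lockstep coupling is a small translation layer that maps internal rows to business-level resources, so a schema change touches only the mapping. The sketch below is illustrative only (not from the book), and all of its column and field names are hypothetical.

# Keep the API representation decoupled from the database row: renaming
# or splitting a column only changes this mapping, not every consumer.
from datetime import date

def order_row_to_resource(row: dict) -> dict:
    """Translate an internal database row into the API's resource shape."""
    return {
        "orderId": row["order_pk"],
        "placedOn": row["created_ts"].isoformat(),
        "status": "OPEN" if row["status_cd"] == 1 else "CLOSED",
    }

# Example: a row as it might come back from the data layer.
row = {"order_pk": 42, "created_ts": date(2024, 1, 5), "status_cd": 1}
print(order_row_to_resource(row))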


Django Book by Matt Behrens

Benevolent Dictator For Life (BDFL), book value, business logic, create, read, update, delete, database schema, distributed revision control, don't repeat yourself, duck typing, en.wikipedia.org, Firefox, full text search, loose coupling, MITM: man-in-the-middle, MVC pattern, revision control, Ruby on Rails, school choice, slashdot, SQL injection, web application

Middleware modules running inside it (coming later in the stack) will be under the same transaction control as the view functions. See Appendix B for more information about database transactions.

What's Next?

Web developers and database-schema designers don't always have the luxury of starting from scratch. In the next chapter, we'll cover how to integrate with legacy systems, such as database schemas you've inherited from the 1980s.

Chapter 18: Integrating with Legacy Databases and Applications

Django is best suited for so-called green-field development – that is, starting projects from scratch, as if you were constructing a building on a fresh field of green grass.

It looks at all of the models in each app in your INSTALLED_APPS setting, checks the database to see whether the appropriate tables exist yet, and creates the tables if they don’t yet exist. Note that syncdb does not sync changes in models or deletions of models; if you make a change to a model or delete a model, and you want to update the database, syncdb will not handle that. (More on this in the “Making Changes to a Database Schema” section in Chapter 10.) If you run python manage.py syncdb again, nothing happens, because you haven’t added any models to the books app or added any apps to INSTALLED_APPS. Ergo, it’s always safe to run python manage.py syncdb – it won’t clobber things. If you’re interested, take a moment to dive into your database server’s command-line client and see the database tables Django created.

Here's the revised code:

class Book(models.Model):
    title = models.CharField(max_length=100)
    authors = models.ManyToManyField(Author)
    publisher = models.ForeignKey(Publisher)
    publication_date = models.DateField(blank=True, null=True)

Adding null=True is more complicated than adding blank=True, because null=True changes the semantics of the database – that is, it changes the CREATE TABLE statement to remove the NOT NULL from the publication_date field. To complete this change, we'll need to update the database. For a number of reasons, Django does not attempt to automate changes to database schemas, so it's your own responsibility to execute the appropriate ALTER TABLE statement whenever you make such a change to a model. Recall that you can use manage.py dbshell to enter your database server's shell. Here's how to remove the NOT NULL in this particular case:

ALTER TABLE books_book ALTER COLUMN publication_date DROP NOT NULL;

(Note that this SQL syntax is specific to PostgreSQL.)


pages: 205 words: 47,169

PostgreSQL: Up and Running by Regina Obe, Leo Hsu

cloud computing, database schema, Debian, en.wikipedia.org, full text search, Salesforce, SQL injection, web application

What is unique about PostgreSQL tables is the inheritance support and the fact that every table automatically begets an accompanying custom data type. Tables can inherit from other tables, and querying can bring up child records from child tables.

schema: Schemas are part of the ANSI-SQL standard, so you'll see them in other databases. Schemas are the logical containers of tables and other objects. Each database can have multiple schemas.

tablespace: A tablespace is the physical location where data is stored. PostgreSQL allows tablespaces to be independently managed, which means you can easily move databases to different drives with just a few commands.
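As a brief illustration of the inheritance point (not an example from the book), the sketch below creates a child table with psycopg2 and shows that querying the parent also returns the child's rows; the table names and connection settings are placeholders.

# Demonstrate table inheritance: rows inserted into the child table
# show up when the parent table is queried.
import psycopg2

conn = psycopg2.connect(dbname="mydb", user="postgres",
                        password="secret", host="localhost")
with conn, conn.cursor() as cur:
    cur.execute("""
        CREATE TABLE IF NOT EXISTS logs (
            id serial PRIMARY KEY,
            msg text
        );
        CREATE TABLE IF NOT EXISTS logs_archive () INHERITS (logs);
    """)
    cur.execute("INSERT INTO logs_archive (msg) VALUES (%s)", ("archived entry",))
    # Querying the parent also returns rows stored in child tables.
    cur.execute("SELECT msg FROM logs;")
    print(cur.fetchall())
conn.close()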

It can even export as HTML, providing you with a turn-key reporting engine, albeit a bit crude.

Backup and restore wizard: Can't remember the myriad of commands and switches to perform a backup or restore using pg_restore and pg_dump? pgAdmin has a nice interface that'll let you selectively back up and restore databases, schemas, single tables, and globals, and the message tab shows you the command-line pg_dump or pg_restore it used to do it.

Grant Wizard: This time-saver will allow you to change permissions on many database objects in one fell swoop.

pgScript engine: This is a quick and dirty way to run scripts that don't have to complete as a transaction.

Selective Backup of Database Objects

pgAdmin provides a graphical interface to pg_dump, which we covered in Selective Backup Using pg_dump, for doing selective backup of objects. To back up selected objects, right-click the object you want to back up and select Backup.... You can back up a whole database, schema, table, or anything else.

Figure 4-11. pgAdmin right-click Backup Schema

If all you want to back up is that one object, you can forgo the other tabs and just do as we did in Figure 4-10. However, you can selectively pick or uncheck some more items by clicking on the Objects tab, as shown in Figure 4-12.


Data Mining: Concepts and Techniques: Concepts and Techniques by Jiawei Han, Micheline Kamber, Jian Pei

backpropagation, bioinformatics, business intelligence, business process, Claude Shannon: information theory, cloud computing, computer vision, correlation coefficient, cyber-physical system, database schema, discrete time, disinformation, distributed generation, finite state, industrial research laboratory, information retrieval, information security, iterative process, knowledge worker, linked data, machine readable, natural language processing, Netflix Prize, Occam's razor, pattern recognition, performance metric, phenotype, power law, random walk, recommendation engine, RFID, search costs, semantic web, seminal paper, sentiment analysis, sparse data, speech recognition, statistical model, stochastic process, supply-chain management, text mining, thinkpad, Thomas Bayes, web application

Figure 3.12 A concept hierarchy for the attribute price, where an interval ($X … $Y] denotes the range from $X (exclusive) to $Y (inclusive).

6. Concept hierarchy generation for nominal data, where attributes such as street can be generalized to higher-level concepts, like city or country. Many hierarchies for nominal attributes are implicit within the database schema and can be automatically defined at the schema definition level.

Recall that there is much overlap between the major data preprocessing tasks. The first three of these strategies were discussed earlier in this chapter. Smoothing is a form of data cleaning and was addressed in Section 3.2.2.

Nominal attributes have a finite (but possibly large) number of distinct values, with no ordering among the values. Examples include geographic_location, job_category, and item_type. Manual definition of concept hierarchies can be a tedious and time-consuming task for a user or a domain expert. Fortunately, many hierarchies are implicit within the database schema and can be automatically defined at the schema definition level. The concept hierarchies can be used to transform the data into multiple levels of granularity. For example, data mining patterns regarding sales may be found relating to specific regions or countries, in addition to individual branch locations.
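As a toy illustration (not from the book), a concept hierarchy for a nominal attribute can be as simple as a pair of lookup tables that roll street values up to city and country; the sample values below are made up.

# Generalize a nominal attribute up a concept hierarchy:
# street -> city -> country.
street_to_city = {"Main St": "Chicago", "King St": "Toronto"}
city_to_country = {"Chicago": "USA", "Toronto": "Canada"}

def generalize(street, level):
    """Roll a street value up to the requested level of the hierarchy."""
    if level == "street":
        return street
    city = street_to_city[street]
    if level == "city":
        return city
    return city_to_country[city]  # level == "country"

sales = [("Main St", 120.0), ("King St", 80.0), ("Main St", 45.0)]
# Mining at the country level instead of individual streets:
totals = {}
for street, amount in sales:
    key = generalize(street, "country")
    totals[key] = totals.get(key, 0.0) + amount
print(totals)  # {'USA': 165.0, 'Canada': 80.0}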

Consequently, the user may have included only a small subset of the relevant attributes in the hierarchy specification. For example, instead of including all of the hierarchically relevant attributes for location, the user may have specified only street and city. To handle such partially specified hierarchies, it is important to embed data semantics in the database schema so that attributes with tight semantic connections can be pinned together. In this way, the specification of one attribute may trigger a whole group of semantically tightly linked attributes to be “dragged in” to form a complete hierarchy. Users, however, should have the option to override this feature, as necessary.


pages: 725 words: 168,262

API Design Patterns by Jj Geewax

Amazon Web Services, anti-pattern, bitcoin, blockchain, business logic, cognitive load, continuous integration, COVID-19, database schema, en.wikipedia.org, exponential backoff, imposter syndrome, Internet of things, Kubernetes, lateral thinking, loose coupling, machine readable, microservices, natural language processing, Paradox of Choice, ride hailing / ride sharing, social graph, sorting algorithm

Who should read this book API Design Patterns is for anyone who is building or plans to build a web API, particularly when that API will be exposed to the public. Familiarity with some serialization formats (e.g., JSON, Google Protocol Buffers, or Apache Thrift) or common storage paradigms (e.g., relational database schemas) is certainly nice to have but is in no way required. It’s also a nice bonus if you’re already familiar with HTTP and its various methods (e.g., GET and POST), as it is the transport of choice throughout the examples in this book. If you find yourself designing an API and running into problems and thinking, “I’m sure someone must have figured this out already,” this book is for you.

For example, if we were to build an API for a chat room, the resource layout refers to the choices we make that might result in a ChatRoom resource along with a User resource that might associate somehow with a room. This manner, in which User and ChatRoom resources associate with one another, is what we’re interested in. If you’ve ever designed a relational database with various tables, this should feel familiar: the database schema you design is often very similar in nature to how the API is represented. While it might be tempting to say that the relationship is all that matters, it’s actually a bit more involved than this. While the relationship itself is the only thing with a direct influence in the resource layout we end up with, there are many other factors that influence the layout indirectly.

Let’s start by looking at an important question: do we need a relationship at all? 4.2.1 Do you need a relationship at all? When building an API, after we’ve chosen the list of things or resources that matter to us, the next step is to decide how these resources relate to one another. Sort of like normalizing a database schema, it can often be tempting to connect everything that might ever need to be connected in order to build a rich web of interconnected resources that we can easily navigate through a variety of pointers and references. While this will certainly lead to a very rich and descriptive API, it can also become untenable and run into serious performance degradation as the API receives more and more traffic and stores more and more data.


Software Design Decoded: 66 Ways Experts Think by Marian Petre, André van Der Hoek, Yen Quach

database schema, lone genius

When the situation calls for it, they employ much more formal diagrams than the sketches they typically create. To model certain phenomena more clearly—with more precision and completeness—they may well work out a state machine in all of its detail, edit pseudocode on a whiteboard, or meticulously specify all of the entities, fields, keys, and relationships in a database schema. Once done, however, they will quickly return to sketching, employing a more informal style that abandons much notational detail.

27 Experts invent notations

Experts choose a notation that suits the problem, even if the notation does not exist. New notations arise when, in the heat of design, shorthand symbols are used that take on a meaning of their own.


RDF Database Systems: Triples Storage and SPARQL Query Processing by Olivier Cure, Guillaume Blin

Amazon Web Services, bioinformatics, business intelligence, cloud computing, database schema, fault tolerance, folksonomy, full text search, functional programming, information retrieval, Internet Archive, Internet of things, linked data, machine readable, NP-complete, peer-to-peer, performance metric, power law, random walk, recommendation engine, RFID, semantic web, Silicon Valley, social intelligence, software as a service, SPARQL, sparse data, web application

…literals at the object position. The fact that a given URI can be repeated multiple times in a data set, and that URIs are generally long strings of characters, raises some memory footprint issues. Therefore, the use of efficient dictionaries—encoding URIs with identifiers that are used for the representation of triples—is crucial in triple stores, while in relational models a suitable database schema minimizes such data redundancy. The second feature is that a data management system requires a language to query the data it contains. For this purpose, the Semantic Web group at the W3C has published the SPARQL Protocol and RDF Query Language (SPARQL) recommendation; one of its main aspects is to support a query language.
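A minimal sketch of such a dictionary (not taken from the book) might look like the following: each distinct URI or literal is assigned a compact integer identifier, and triples are stored as tuples of those identifiers.

# Dictionary encoding for a triple store: long, repeated URIs are
# mapped to small integer ids, and triples are stored as id tuples.
class Dictionary:
    def __init__(self):
        self.term_to_id = {}
        self.id_to_term = []

    def encode(self, term):
        # Assign a new id the first time a URI or literal is seen.
        if term not in self.term_to_id:
            self.term_to_id[term] = len(self.id_to_term)
            self.id_to_term.append(term)
        return self.term_to_id[term]

    def decode(self, ident):
        return self.id_to_term[ident]

d = Dictionary()
triples = [
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/knows", "http://example.org/bob"),
    ("http://example.org/bob", "http://xmlns.com/foaf/0.1/knows", "http://example.org/alice"),
]
encoded = [tuple(d.encode(t) for t in triple) for triple in triples]
print(encoded)                  # [(0, 1, 2), (2, 1, 0)]
print(d.decode(encoded[0][0]))  # back to the subject URI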

At least three out of four NoSQL system families adopt this approach, and they are mainly differentiated by the type of information they can store at the value position. The other quarter of NoSQL systems rely on the graph model and have a set of totally different properties and use cases. An important aspect of the application domains motivating the emergence of NoSQL stores was the need for flexibility in terms of database schema. In fact, the designers of these systems pushed this characteristic to its logical limit by not providing facilities to define a schema. This schemaless property facilitates an important feature needed in many data management approaches: the integration of novel data into the database. For example, consider a relation where one can add as many attributes as needed.

Moreover, if the RDF data set is associated with an RDFS or OWL ontology, then the possibility of cleverly handling query processing is increased. In this chapter, we do not consider deep reasoning mechanisms, but defer that presentation to Chapter 8. Nonetheless, the expressivity of Semantic Web ontologies supports some query simplifications that cannot be performed using the metadata contained in a relational database schema. For instance, we present a semantic simplification of a SPARQL basic graph pattern (BGP), a set of triples in a WHERE clause, that can be integrated in the rewriting component. Most of the queries we are dealing with in this chapter involve selections, or SPARQL queries with SELECT, ASK, DESCRIBE, or CONSTRUCT clauses.


pages: 1,409 words: 205,237

Architecting Modern Data Platforms: A Guide to Enterprise Hadoop at Scale by Jan Kunigk, Ian Buss, Paul Wilkinson, Lars George

Amazon Web Services, barriers to entry, bitcoin, business intelligence, business logic, business process, cloud computing, commoditize, computer vision, continuous integration, create, read, update, delete, data science, database schema, Debian, deep learning, DevOps, domain-specific language, fault tolerance, Firefox, FOSDEM, functional programming, Google Chrome, Induced demand, information security, Infrastructure as a Service, Internet of things, job automation, Kickstarter, Kubernetes, level 1 cache, loose coupling, microservices, natural language processing, Network effects, platform as a service, single source of truth, source of truth, statistical model, vertical integration, web application

For example, newer versions of Hive or Hue may need to update their internal database schemas. Since the recommended way of dealing with Hadoop is having multiple environments (see Chapter 2), you can stage the service database-related schema updates first and then roll them out, as supported by each Hadoop subsystem. For example, Oozie can be shut down briefly to update the software and database schemas, without interrupting data pipelines that are in flight. Instead of manually dealing with these maintenance tasks, the Hadoop management systems support the automated upgrade of components and their internal database schemas. Therefore, you can simply press a button, wait, and see all of the systems return to an operational state.

Most analysts are able to process and explore data in statistical or scientific programming languages such as R or Python. For scalable productive BI reports, the analyst uses a range of algorithms on multidimensional views on the data in tabular form (also known as cubes) based on star or snowflake database schemas in the data warehouse. This achieves a high degree of abstraction for the analyst, who focuses on the data algorithms that often have been optimized for cubes in the BI solution. Software developer Software developers not only write, build, and maintain code, but also are involved with the continuous integration and operation of the underlying infrastructure.

scaling clusters up and down, automating in cloud deployments, Scaling Up and Down scaling data warehouses, The Traditional Approach scaling networks, Modular switches selecting instances for, Instances spine-leaf networks, Scalability scaling, Big Data Technology Primer schemasHive, Service Databases schema on read, Apache Hive Metastore schemaless mode for Solr collections, Apache Solr updating for service databases, Database Considerations schemes (ZooKeeper), ZooKeeper scopes (HBase), HBase SCR (see short-circuit reads) SCSI SYNCHRONIZE CACHE command, Disk cache SDN (see software-defined networking) search bind (LDAP, LDAP Authentication searches, Big Data Technology Primer, Storage Enginesin Solr, Apache Solr secret keys for Amazon S3, Amazon Simple Storage Service Secure Sockets Layer (SSL) (see Transport Layer Security) security, Security-Summaryaccess security for clusters, Access Security at-rest encryption, At-Rest Encryption-Encrypting Temporary FilesHDFS Transparent Data Encryption, HDFS Transparent Data Encryption-KMS implementations options for Hadoop services and data, At-Rest Encryption volume encryption with Cloudera NE and KTS, Volume Encryption with Cloudera Navigator Encrypt and Key Trustee Server authentication, Authentication-Impersonationdelegation tokens, Delegation Tokens impersonation, Impersonation Kerberos, Kerberos LDAP, LDAP Authentication authorization, Authorization-Sentrycentralized security management, Centralized Security Management group resolution, Group Resolution Hadoop service level authorization, Hadoop Service Level Authorization HBase, HBase HDFS, HDFS Hive, Hive Hue, Hue Kafka, Kafka Kudu, Kudu Sentry, Sentry Solr, Solr superusers and supergroups, Superusers and Supergroups-Supergroups YARN, YARN ZooKeeper, ZooKeeper cluster security and backups, Hadoop Cluster Backups configuring for high availability in Hue, Deployment options configuring for high availability in Oozie, Deployment considerations considerations when using load balancing, Security considerations disabling for platform benchmarks, Hadoop Validation HDFS and high availability, Security in cloud auto provisioning of clusters, Deploying with Security-TLSintegrating with Kerberos KDC, Integrating with a Kerberos KDC TLS, TLS in OpenShift containers, Isolation in public cloud solutions, Key Things to Know in the cloud, Security in the Cloud-Summaryassessing the risk, Assessing the Risk-Assessing the Risk auditing, Auditing encryption for data at rest, Encryption for Data at Rest-Recommendations and Summary for Cloud Encryption identity provider options for Hadoop, Identity Provider Options for Hadoop-Option C: On-Premises ID Services object storage security and Hadoop, Object Storage Security and Hadoop-Auditing perimeter controls and firewalling, Perimeter Controls and Firewalling-Summary risk model, Risk Model in-flight encryption, In-Flight Encryption-Authenticationenabling, Enabling in-Flight Encryption SASL quality of protection, SASL Quality of Protection TLS encryption, TLS Encryption internet-facing clusters, Layer 1 Recommendations LDAP, LDAP Security operating system security adjustments, OS Configuration for Hadoop Security Assertion Markup Language (SAML), Authentication, Hue security groups, AWSnetwork security groups, Azure Security-Enhanced Linux (SELinux)disabling, OS Configuration for Hadoop in OpenShift private cloud, Isolation seek system call, Important System Calls segmentation fault (Linux), Nonuniform Memory Access self-signed certificates, Certificate 
Management, TLSuse cases, Certificate Management Sentry, Required Databases, Other Servicesapplying HDFS ACLs, HDFS authorization in, Sentry backing up Apache Sentry, Apache Sentry centralized authorization control for Kafka, Kafka centralized authorization with, Centralized Security Management protecting all user interaction with Hive, Hive protection for reads/writes to Kudu tables through Impala integration, Kudu used for Kafka authorization, Deployment considerations using for Impala authorization, Impala using in Solr, Solr separate database mode, Database Integration Options server form factors, Server Form Factors-Guidancecomparison of, Form Factor Comparisonconfigurations and price sampling, Form Factor Comparison guidelines for choosing, Guidance Server Message (SMB) storage, Network-attached storage server-server traffic pattern, Scalability server-side encryption (SSE), Server-side and client-side encryptionsupport by Amazon S3, Encryption in AWS serversbootstrapping, Operating Systems certificates, TLS commodity, Commodity Servers-Server CPUs and RAM for ZooKeeper deployments, Deployment considerations optimized configurations, Optimized Server Configurations server CPUs and RAMcores and threads in Hadoop, Threads and cores in Hadoop role of x86 architecture, The role of the x86 architecture service accountsaccess in Azure Data Lake Store, ADLS in GCP Cloud Storage, GCP Cloud Storage, Service account service databases, Service Databases-Database Integration Optionsintegration options, Database Integration Options required databases, Required Databases service edge nodes, Interaction Patterns service level authorization, Authorization, Hadoop Service Level Authorization service principal name (SPN), Layer 3 Recommendations service principals (Kerberos), Principals service proxies, Service proxies service tags, Azure service tickets (Kerberos), Kerberos service-level agreements (SLAs), Multiple Clusters for Workload Isolation, Measuring Availability, Policies and Objectives service-level objectives (SLOs), Measuring Availability servicesavailability in public cloud solutions, Service availability high availability, Lateral/Service HA resource contention between cluster services, Multiple Clusters for Workload Isolation separation of identical service roles, Separation of Identical Service Roles sessions, LDAP Authenticationsession persistence, Session persistencefor Hue, Hue in Impala, Architecting for HA load balancer with, Deployment considerations shared access signatures (SASs), Blob storage Shared Data Experience (SDX), Multiple Clusters and Independent Storage and Compute shared database mode, Database Integration Options shells (Linux), Command-Line Access short-circuit reads (HDFS), Requirements for Multitenancy, Short-Circuit and Zero-Copy Readsbenefits of, Short-Circuit and Zero-Copy Reads short-lived (or transient) life cycle, Cluster Life Cycle Models Shortest Path Bridging (SPB), Implementation shuffles, Shuffles Simple and Protected GSSAPI Negotiation Mechanism (SPNEGO), Kerberos over HTTP, Hue Simple Network Management Protocol (SNMP) traps, Monitoring Simple Storage Service (S3) (see Amazon Simple Storage Service) SimpleAWSCredentialsProvider, Persistent credentials simultaneous multithreading (SMT), Threads and cores in Hadoop single endpoint interaction pattern, Interaction Patterns single points of failure (SPOFs), Consensus, High Availability Definedusing a single load balancer, Dedicated load balancers single sign-on (SSO) technologies, Authenticationsupport 
by Hue, Hue single switch network architecture, Single switchimplementation, Implementation small cluster configurations, Small Cluster Configurations Small Computer Systems Interface (SCSI) disk blocks, SANsiSCSI, SANs SMART, Disk health smoke testing, Platform Validation SMP (symmetric multiprocessing), Threads and cores in Hadoop SMT (simultaneous multithreading), Threads and cores in Hadoop snapshots, Snapshotsin HBase, HBase, Subflow: HBase of Amazon S3 object storage, AWS storage options SOCKS proxies, SOCKS proxies, YARN softwarein datacenters, effects on ingest and extraction speed, Software load balancers, Dedicated load balancers software configuration management (SCM) tools, Operating Systemsusing in platform validation testing, Useful Tools using to automate installation of Hadoop binaries, Installation Choices software developers, Software developer software license efficiency, Server Form Factors software overlay networks, Network Virtualization software-defined networking (SDN), Network Virtualization-Network Virtualization, Solutions for Private Cloudsimportance for Hadoop, Network Virtualization in OpenShift Kubernetes, Isolation software-only SDN, Network Virtualization solid state drives (SSDs), SAS, Nearline SAS, or SATA (or SSDs)?


Learning Flask Framework by Matt Copperwaite, Charles Leifer

create, read, update, delete, database schema, Debian, DevOps, don't repeat yourself, duck typing, full text search, place-making, Skype, SQL injection, web application

We covered the benefits of using a relational database and an ORM, configured a Flask application to connect to a relational database, and created SQLAlchemy models. All this allowed us to create relationships between our data and perform queries. To top it off, we also used a migration tool to handle future database schema changes. In Chapter 3, Templates and Views we will set aside the interactive interpreter and start creating views to display blog entries in the web browser. We will put all our SQLAlchemy knowledge to work by creating interesting lists of blog entries, as well as a simple search feature. We will build a set of templates to make the blogging site visually appealing, and learn how to use the Jinja2 templating language to eliminate repetitive HTML coding.

Below the tags relationship, add the following code to the Entry model definition:

class Entry(db.Model):
    # ...
    tags = db.relationship('Tag', secondary=entry_tags,
        backref=db.backref('entries', lazy='dynamic'))
    comments = db.relationship('Comment', backref='entry', lazy='dynamic')

We've specified the relationship as lazy='dynamic', which, as you will recall from Chapter 5, Authenticating Users, means that the comments attribute on any given Entry instance will be a filterable query.

Creating a schema migration

In order to start using our new model, we need to update our database schema. Using the manage.py helper, create a schema migration for the Comment model:

(blog) $ python manage.py db migrate
INFO [alembic.migration] Context impl SQLiteImpl.
INFO [alembic.migration] Will assume non-transactional DDL.
INFO [alembic.autogenerate.compare] Detected added table 'comment'
Generating /home/charles/projects/blog/app/migrations/versions/490b6bc5f73c_.py ... done

Then apply the migration by running upgrade:

(blog) $ python manage.py db upgrade
INFO [alembic.migration] Context impl SQLiteImpl.


pages: 933 words: 205,691

Hadoop: The Definitive Guide by Tom White

Amazon Web Services, bioinformatics, business intelligence, business logic, combinatorial explosion, data science, database schema, Debian, domain-specific language, en.wikipedia.org, exponential backoff, fallacies of distributed computing, fault tolerance, full text search, functional programming, Grace Hopper, information retrieval, Internet Archive, Kickstarter, Large Hadron Collider, linked data, loose coupling, openstreetmap, recommendation engine, RFID, SETI@home, social graph, sparse data, web application

Start it using the following commands:

% export ANT_LIB=/path/to/ant/lib
% hive --service hwi

(You only need to set the ANT_LIB environment variable if Ant's library is not found in /opt/ant/lib on your system.) Then navigate to http://localhost:9999/hwi in your browser. From there, you can browse Hive database schemas and create sessions for issuing commands and queries. It's possible to run the web interface as a shared service to give users within an organization access to Hive without having to install any client software. There are more details on the Hive Web Interface on the Hive wiki at https://cwiki.apache.org/confluence/display/Hive/HiveWebInterface.

The data typically resides in HDFS, although it may reside in any Hadoop filesystem, including the local filesystem or S3. Hive stores the metadata in a relational database—and not in HDFS, say (see The Metastore). In this section, we shall look in more detail at how to create tables, the different physical storage formats that Hive offers, and how to import data into them.

Multiple Database/Schema Support

Many relational databases have a facility for multiple namespaces, which allow users and applications to be segregated into different databases or schemas. Hive supports the same facility, and provides commands such as CREATE DATABASE dbname, USE dbname, and DROP DATABASE dbname. You can fully qualify a table by writing dbname.tablename.

Chapter 2 ("Installing and Upgrading MySQL") in particular should help. Users of Debian-based Linux systems (e.g., Ubuntu) can type sudo apt-get install mysql-client mysql-server. RedHat users can type sudo yum install mysql mysql-server. Now that MySQL is installed, let's log in and create a database (Example 15-1).

Example 15-1. Creating a new MySQL database schema

% mysql -u root -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 349
Server version: 5.1.37-1ubuntu5.4 (Ubuntu)
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> CREATE DATABASE hadoopguide;
Query OK, 1 row affected (0.02 sec)
mysql> GRANT ALL PRIVILEGES ON hadoopguide.* TO '%'@'localhost';
Query OK, 0 rows affected (0.00 sec)
mysql> GRANT ALL PRIVILEGES ON hadoopguide.* TO ''@'localhost';
Query OK, 0 rows affected (0.00 sec)
mysql> quit;
Bye

The password prompt above asks for your root user password.


pages: 292 words: 62,575

97 Things Every Programmer Should Know by Kevlin Henney

A Pattern Language, active measures, Apollo 11, business intelligence, business logic, commoditize, continuous integration, crowdsourcing, database schema, deliberate practice, domain-specific language, don't repeat yourself, Donald Knuth, fixed income, functional programming, general-purpose programming language, Grace Hopper, index card, inventory management, job satisfaction, level 1 cache, loose coupling, machine readable, Silicon Valley, sorting algorithm, The Wisdom of Crowds

If an object has many possible variations in its behavior, these behaviors can be injected using the Strategy pattern rather than large if-then structures. In fact, the formulation of design patterns themselves is an attempt to reduce the duplication of effort required to solve common problems and discuss such solutions. In addition, DRY can be applied to structures, such as database schemas, resulting in normalization.

A Matter of Principle

Other software principles are also related to DRY. The Once and Only Once principle, which applies only to the functional behavior of code, can be thought of as a subset of DRY. The Open/Closed Principle, which states that "software entities should be open for extension, but closed for modification," only works in practice when DRY is followed.

However, the problem is that the three functions change for entirely different reasons. The calculatePay function will change whenever the business rules for calculating pay do. The reportHours function will change whenever someone wants a different format for the report. The save function will change whenever the DBAs change the database schema. These three reasons to change combine to make Employee very volatile. It will change for any of those reasons. More importantly, any classes that depend upon Employee will be affected by those changes. Good system design means that we separate the system into components that can be independently deployed.
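A hedged sketch of that separation (not the book's own code) might split the class so that each piece changes for exactly one reason; the names and method bodies below are placeholders, and the connection is assumed to be any DB-API connection such as sqlite3.

class Employee:
    """Pure data: name, hours worked, pay rate."""
    def __init__(self, name, hours_worked, hourly_rate):
        self.name = name
        self.hours_worked = hours_worked
        self.hourly_rate = hourly_rate

class PayCalculator:
    """Changes only when the business rules for pay change."""
    def calculate_pay(self, employee):
        return employee.hours_worked * employee.hourly_rate

class HoursReporter:
    """Changes only when the report format changes."""
    def report_hours(self, employee):
        return f"{employee.name}: {employee.hours_worked}h"

class EmployeeRepository:
    """Changes only when the DBAs change the database schema."""
    def save(self, employee, connection):
        connection.execute(
            "INSERT INTO employees (name, hours, rate) VALUES (?, ?, ?)",
            (employee.name, employee.hours_worked, employee.hourly_rate),
        )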


pages: 470 words: 109,589

Apache Solr 3 Enterprise Search Server by Unknown

bioinformatics, business logic, continuous integration, database schema, en.wikipedia.org, fault tolerance, Firefox, full text search, functional programming, information retrieval, natural language processing, performance metric, platform as a service, Ruby on Rails, SQL injection, Wayback Machine, web application

In order to make it easier for you to play with this data, the online code supplement to this book includes the data in formats that can readily be imported into Solr. Alternatively, if you already have your own data then I recommend starting with that, using this book as a guide. The MusicBrainz.org database is highly relational. Therefore, it will serve as an excellent instructional data set to discuss Solr schema choices. The MusicBrainz database schema is quite complex, and it would be a distraction to go over even half of it. I'm going to use a subset of it and express it in a way that has a straightforward mapping to the user interface seen on the MusicBrainz website. Each of these tables depicted in the following diagram can be easily constructed through SQL sub-queries or views from the actual MusicBrainz tables.

Even if you have one type of data to search for in an application, you might still use multiple cores (with the same configuration) and shard the data for scaling. Managing Solr Cores is discussed further in the deployment chapter. One combined index A combined index might also be called an aggregate index. As mentioned in the first chapter, an index is conceptually like a single-table relational database schema, thus sharing similarities with some NoSQL (non-relational) databases. In spite of this limitation, there is nothing to stop you from putting different types of data (say, artists and releases from MusicBrainz) into a single index. All you have to do is use different fields for the different document types, and use a field to discriminate between the types.
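As a rough sketch of that idea (not one of the book's examples), the snippet below uses the pysolr client, which is an assumption, as are the field names and Solr URL; the schema would need matching field definitions, with type acting as the discriminator.

# Put two document types into one combined index and filter by type.
import pysolr

solr = pysolr.Solr("http://localhost:8983/solr/", timeout=10)

# Different document types share one index; a "type" field discriminates.
solr.add([
    {"id": "artist-1", "type": "artist", "a_name": "Smashing Pumpkins"},
    {"id": "release-1", "type": "release", "r_name": "Siamese Dream"},
])

# Restrict a search to one document type with a filter query.
for doc in solr.search("siamese", fq="type:release"):
    print(doc["id"])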

If it is not specified, then it defaults to the column name. When a column in the result can be placed directly into Solr without further processing, there is no need to specify the field declaration, because it is implied. When importing from a database, use the SQL AS keyword to use the same names as the Solr schema instead of the database schema. This reduces the number of <field/> elements and shortens existing ones. An attribute of the entity declaration that we didn't mention yet is transformer. This declares a comma-separated list of transformers that create, modify, and delete fields and even entire documents. The transformers are evaluated in order, which can be significant.


Programming Computer Vision with Python by Jan Erik Solem

augmented reality, computer vision, database schema, en.wikipedia.org, optical character recognition, pattern recognition, text mining, Thomas Bayes, web application

Indexing images in this context means extracting descriptors from the images, converting them to visual words using a vocabulary, and storing the visual words and word histograms with information about which image they belong to. This will make it possible to query the database using an image and get the most similar images back as search result.

Table 7-1. A simple database schema for storing images and visual words.

imlist      imwords     imhistograms
rowid       imid        imid
filename    wordid      histogram
            vocname     vocname

Here we will use SQLite as database. SQLite is a database that stores everything in a single file and is very easy to set up and use. We are using it here since it is the easiest way to get started without having to go into database and server configurations and other details way outside the scope of this book.

SQLite is imported from the pysqlite2 module (see Appendix A for installation details). The Indexer class connects to a database and stores a vocabulary object upon creation (where the __init__() method is called). The __del__() method makes sure to close the database connection and db_commit() writes the changes to the database file. We only need a very simple database schema of three tables. The table imlist contains the filenames of all indexed images, and imwords contains a word index of the words, which vocabulary was used, and which images the words appear in. Finally, imhistograms contains the full word histograms for each image. We need those to compare images according to our vector space model.
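A minimal sketch of creating that three-table schema with Python's built-in sqlite3 module follows; it assumes the layout of Table 7-1, and the column types, index names, and sample filename are illustrative guesses.

# Create the imlist/imwords/imhistograms tables and insert one image.
import sqlite3

con = sqlite3.connect("test.db")
con.executescript("""
    CREATE TABLE IF NOT EXISTS imlist (filename TEXT);
    CREATE TABLE IF NOT EXISTS imwords (imid INTEGER, wordid INTEGER, vocname TEXT);
    CREATE TABLE IF NOT EXISTS imhistograms (imid INTEGER, histogram BLOB, vocname TEXT);
    CREATE INDEX IF NOT EXISTS im_idx ON imlist(filename);
    CREATE INDEX IF NOT EXISTS wordid_idx ON imwords(wordid);
""")
con.commit()

# imlist's implicit rowid serves as the image id referenced by the
# other two tables.
cur = con.execute("INSERT INTO imlist (filename) VALUES (?)", ("ukbench00000.jpg",))
imid = cur.lastrowid
con.execute("INSERT INTO imwords (imid, wordid, vocname) VALUES (?, ?, ?)",
            (imid, 42, "test_vocabulary"))
con.commit()
con.close()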


pages: 485 words: 74,211

Developing Web Applications with Haskell and Yesod by Michael Snoyman

create, read, update, delete, database schema, Debian, domain-specific language, don't repeat yourself, full text search, functional programming, MVC pattern, web application

Your table might look something like:

CREATE TABLE Person(id SERIAL PRIMARY KEY, name VARCHAR NOT NULL, age INTEGER)

And if you are using a database like PostgreSQL, you can be guaranteed that the database will never store some arbitrary text in your age field. (The same cannot be said of SQLite, but let's forget about that for now.) To mirror this database table, you would likely create a Haskell data type that looks something like:

data Person = Person
    { personName :: Text
    , personAge  :: Int
    }

It looks like everything is type safe: the database schema matches our Haskell data types, the database ensures that invalid data can never make it into our data store, and everything is generally awesome. Well, until:

• You want to pull data from the database, and the database layer gives you the data in an untyped format.
• You want to find everyone older than 32, and you accidentally write "thirtytwo" in your SQL statement.

Suppose that we now want to add a field for a person's favorite programming language:

{-# LANGUAGE QuasiQuotes, TypeFamilies, GeneralizedNewtypeDeriving, TemplateHaskell, OverloadedStrings, GADTs, FlexibleContexts #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persist|
Person
    name String
    age Int Maybe
    created UTCTime default=now()
    language String default='Haskell'
|]

main = withSqliteConn ":memory:" $ runSqlConn $ do
    runMigration migrateAll

Note: The default attribute has absolutely no impact on the Haskell code itself; you still need to fill in all values. This will only affect the database schema and automatic migrations. We need to surround the string with single quotes so that the database can properly interpret it. Finally, Persistent can use double quotes for containing white space, so if we want to set someone's default home country to be El Salvador:

{-# LANGUAGE QuasiQuotes, TypeFamilies, GeneralizedNewtypeDeriving, TemplateHaskell, OverloadedStrings, GADTs, FlexibleContexts #-}
import Database.Persist
import Database.Persist.Sqlite
import Database.Persist.TH
import Data.Time

share [mkPersist sqlSettings, mkMigrate "migrateAll"] [persist|
Person
    name String
    age Int Maybe
    created UTCTime default=now()
    language String default='Haskell'
    country String "default='El Salvador'"
|]

main = withSqliteConn ":memory:" $ runSqlConn $ do
    runMigration migrateAll

One last trick you can do with attributes is to specify the names to be used for the SQL tables and columns.


Team Topologies: Organizing Business and Technology Teams for Fast Flow by Matthew Skelton, Manuel Pais

anti-pattern, business logic, business process, call centre, cognitive load, continuous integration, Conway's law, database schema, DevOps, different worldview, Dunbar number, holacracy, information security, Infrastructure as a Service, Internet of things, Jeff Bezos, Kanban, Kickstarter, knowledge worker, Kubernetes, Lean Startup, loose coupling, meta-analysis, microservices, Norbert Wiener, operational security, platform as a service, pull request, remote working, systems thinking, two-pizza team, web application

Such applications are typically deployed as a unit, often causing headaches for users (the application is not available during deployment) and operators (unexpected issues because the production environment is a moving target; even if we tested the monolith in an environment similar to production, it has surely drifted since then).

Joined-at-the-Database Monolith

A joined-at-the-database monolith is composed of several applications or services, all coupled to the same database schema, making them difficult to change, test, and deploy separately. This monolith often results from the organization viewing the database, not the services, as the core business engine. It's common to find that one or more database-administration (DBA) teams were put in place to not only maintain the database but also coordinate changes to the database—a task they are often understaffed for—and they become a large bottleneck to delivery.

…"How should this service interact with the ABC service?").

intrinsic cognitive load: relates to aspects of the task fundamental to the problem space (e.g., "What is the structure of a Java class?" "How do I create a new method?").

joined-at-the-database monolith: composed of several applications or services all coupled to the same database schema, making them difficult to change, test, and deploy separately.

monolithic build: uses one gigantic continuous integration (CI) build to get a new version of a component.

monolithic model: software that attempts to force a single domain language and representation (format) across many different contexts.


pages: 1,237 words: 227,370

Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Infrastructure as a Service, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

If you use Avro, you can fairly easily generate an Avro schema (in the JSON representation we saw earlier) from the relational schema and encode the database contents using that schema, dumping it all to an Avro object container file [25]. You generate a record schema for each database table, and each column becomes a field in that record. The column name in the database maps to the field name in Avro. Now, if the database schema changes (for example, a table has one column added and one column removed), you can just generate a new Avro schema from the updated database schema and export data in the new Avro schema. The data export process does not need to pay any attention to the schema change—it can simply do the schema conversion every time it runs. Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema.
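A simplified sketch of that generation step (not Avro's or the book's actual tooling) is shown below; the SQL-to-Avro type mapping is a deliberately small assumption, and nullable columns become the usual union with null.

# Generate an Avro record schema (JSON representation) from a table
# definition: one record per table, one field per column.
import json

SQL_TO_AVRO = {"integer": "int", "bigint": "long", "text": "string",
               "varchar": "string", "boolean": "boolean"}

def table_to_avro_schema(table_name, columns):
    """columns: list of (column_name, sql_type, nullable) tuples."""
    fields = []
    for name, sql_type, nullable in columns:
        avro_type = SQL_TO_AVRO[sql_type]
        # Nullable columns become a union with null.
        fields.append({"name": name,
                       "type": ["null", avro_type] if nullable else avro_type})
    return {"type": "record", "name": table_name, "fields": fields}

schema = table_to_avro_schema("users", [("id", "bigint", False), ("name", "text", True)])
print(json.dumps(schema, indent=2))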

In a document database, you would just start writing new documents with the new fields and have code in the application that handles the case when old documents are read. For example:

if (user && user.name && !user.first_name) {
    // Documents written before Dec 8, 2013 don't have first_name
    user.first_name = user.name.split(" ")[0];
}

On the other hand, in a "statically typed" database schema, you would typically perform a migration along the lines of:

ALTER TABLE users ADD COLUMN first_name text;
UPDATE users SET first_name = split_part(name, ' ', 1);         -- PostgreSQL
UPDATE users SET first_name = substring_index(name, ' ', 1);    -- MySQL

Schema changes have a bad reputation of being slow and requiring downtime.

Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema. By contrast, if you were using Thrift or Protocol Buffers for this purpose, the field tags would likely have to be assigned by hand: every time the database schema changes, an administrator would have to manually update the mapping from database column names to field tags. (It might be possible to automate this, but the schema generator would have to be very careful to not assign previously used field tags.) This kind of dynamically generated schema simply wasn’t a design goal of Thrift or Protocol Buffers, whereas it was for Avro.


Python Web Development With Django by Jeff Forcier

business logic, create, read, update, delete, database schema, Debian, don't repeat yourself, duck typing, en.wikipedia.org, Firefox, full text search, functional programming, Guido van Rossum, loose coupling, MVC pattern, revision control, Ruby on Rails, Silicon Valley, slashdot, SQL injection, web application

Introduction

If you're a Web developer, a programmer who creates Web sites, then Django just might change your life. It has certainly changed ours. Anyone with even a little experience building dynamic Web sites knows the pain of reinventing certain standard features over and over. You need to create database schemas. You need to get data into and out of the database. You need to parse URLs. You need to sanitize input. You need to provide content-editing tools. You need to attend to security and usability. And so on.

Where Web Frameworks Come From

At some point you realize the wastefulness of reimplementing all these features on every new project; you decide to code your own libraries from scratch to provide them—or, more likely, you extract those libraries from your latest and greatest creation. Thereafter, when you start a new project, the first thing you do is install your library code.

Modeling a real-world problem in an object-oriented system is often a comparatively simple task, but for high-traffic Web sites the most realistic model isn't always the most efficient. The model encompasses a wide range of potential pitfalls, one of which is changing the model code after your application has been deployed. Although you are "just changing code," you have actually altered your database schema under the covers, and this often causes side effects to the preexisting data stored in the database. We go over many of these real-life concerns in the chapters ahead when exploring the design of some example applications.

Views

Views form much (sometimes most or all) of the logic in Django applications. Their definition is deceptively simple: Python functions are linked to one or more defined URLs, which return HTTP response objects. What happens in between those two endpoints of Django's HTTP mechanisms is entirely up to you.

Instead, it makes sure all model classes are represented as database tables, creating new tables as necessary—but not altering existing ones.

Database "Synchronization"

The reasoning behind syncdb's behavior is that Django's core development team strongly believes one's production data should never be at the mercy of an automated process. Additionally, it is a commonly held belief that changes to database schemas should only be performed when a developer understands SQL well enough to execute those changes by hand. The authors tend to agree with this approach; a better understanding of underlying technology is always preferable when developing with higher-layer tools. At the same time, an automatic or semi-automatic schema change-set mechanism (such as Rails' migrations) can often speed up the development process.


pages: 318 words: 78,451

Kanban: Successful Evolutionary Change for Your Technology Business by David J. Anderson

airport security, anti-pattern, business intelligence, call centre, collapse of Lehman Brothers, continuous integration, corporate governance, database schema, domain-specific language, index card, Kaizen: continuous improvement, Kanban, knowledge worker, lateral thinking, loose coupling, performance metric, six sigma, systems thinking, tacit knowledge, Toyota Production System, transaction costs

While this may sound trivial, it often isn’t. Often it is necessary to plan an elaborate procedure to switch off databases, application servers, and other systems gracefully, and then upgrade them and bring them all back again. One of the biggest issues is data migration from one generation of a database schema to another. Databases can get very large. The process of serializing the data to a file, parsing it, unpacking it, perhaps embellishing or augmenting it with other data, then re-parsing it and unpacking it into a new schema can take hours—perhaps even days. In some environments, software deployment can take hours or days.

More often than that is common in fast-moving domains such as media, where the release cycles may be very frequent.

Delivery/Release

Now we must agree to a similar thing with downstream partners. A delivery cadence that makes sense is very specific to a domain or situation. If it's web-based software, we have to deploy to a server farm. Deployment involves copying files and perhaps upgrading a database schema and then migrating data from one version of the schema to another. This data migration will probably have its own code and it will take its own time to execute. To calculate the total deployment time, you will need to factor in how many servers, how many files to copy, how long it takes to pull systems down gracefully and reboot them, how long data takes to migrate, and so forth.


PostGIS in Action by Regina O. Obe, Leo S. Hsu

business logic, call centre, crowdsourcing, database schema, Debian, domain-specific language, en.wikipedia.org, Firefox, functional programming, Google Earth, job automation, McMansion, Mercator projection, Network effects, null island, openstreetmap, planetary scale, profit maximization, Ruby on Rails, Skype, South of Market, San Francisco, SQL injection, traveling salesman, web application

CREATE SCHEMA ch01;

In PostgreSQL it's very easy to back up selected schemas and also to set up permissions based on schemas. You could, for example, have a big schema of fairly static data that you exclude from your daily backups, and also divide schemas along user groups so that you can allow each group to manage their own schema set of data. The postgis_in_action database schemas are chapter-themed so that it's easy to download just the set of data you need for a specific chapter. Refer to appendix D for more details about schemas and security management.

Restaurants table

Next, you need to create a lookup table to map franchise codes to meaningful names. You can then add all the franchises you'll be dealing with.

The main downside of Imposm is that it only runs on Linux and OS X, and not on Windows, whereas osm2pgsql has support for all the platforms PostGIS runs on. Imposm also currently doesn’t support OSM differential loads, so it’s not a good solution if you have a large OSM data set and want to perform differential updates as things get updated on OSM. One key benefit of Imposm is that it supports custom database schemas, so you can better control the table structures of your data as part of the load process. You can find out more about Imposm at http://imposm.org. Visit www.openstreetmap.org/export to download both utilities. Now that we’ve covered the common free options available for loading data, let’s test these tools. 4.4.1.

Open up the file in an editor and make changes to the path and database connection as appropriate. Once you’ve made the changes and double-checked your typing, go ahead and execute the script file from your OS command line. After you’ve executed the script, you should see a couple of new tables in your tiger_data database schema. The two of most importance are states_all and county_all. Now that you’ve had some practice at loading data, let’s generate a script to load individual state data. If you’re on Linux, Unix, or Mac, run listing 8.4. If you’re on Windows, run listing 8.5. If you created a custom profile in the tiger.loader_platform table, use that name instead of sh or windows.
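
If you want to confirm from code that the loader actually created those tables, a standard information_schema query is enough; the sketch below assumes psycopg2 and made-up connection settings, neither of which comes from the book.

import psycopg2

conn = psycopg2.connect("dbname=geocoder user=postgres")  # connection details are assumptions
cur = conn.cursor()
cur.execute("""
    SELECT table_name
      FROM information_schema.tables
     WHERE table_schema = 'tiger_data'
     ORDER BY table_name""")
tables = [row[0] for row in cur.fetchall()]
print(tables)  # per the text, states_all and county_all should be among them
conn.close()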


pages: 448 words: 84,462

Testing Extreme Programming by Lisa Crispin, Tip House

business logic, c2.com, continuous integration, data acquisition, database schema, Donner party, Drosophila, fail fast, hypertext link, index card, job automation, systems thinking, web application

This is definitely an area where you may need to wait for the system to be available and then play with it to get an idea of how to abuse it. These kinds of tests require a lot of creativity; have fun with them. Sometimes it's tough to even think how to test a particular item. For example, what if a story implements a data model within the database schema but no story uses it in a user interface? It may be that this type of test is better handled by the unit tests—it may not even be appropriate to have a separate acceptance test. Consider whether it impinges on the customer in some indirect way, and design tests for that. After all, if the customer really can't tell whether the story has been implemented or not, that should be a pretty quick story to implement, eh?
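
One way to read "handled by the unit tests" is a test that exercises the schema directly, with no user interface involved. The sketch below uses Python's unittest and an in-memory SQLite table with invented columns; it only illustrates the idea of testing constraints and round-trips at the schema level.

import sqlite3
import unittest

SCHEMA = """
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    action TEXT NOT NULL,
    created_at TEXT NOT NULL
)"""

class AuditLogSchemaTest(unittest.TestCase):
    def setUp(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(SCHEMA)

    def test_required_columns_are_enforced(self):
        # The data model has no UI yet, but its constraints are still testable.
        with self.assertRaises(sqlite3.IntegrityError):
            self.db.execute("INSERT INTO audit_log (id) VALUES (1)")

    def test_rows_round_trip(self):
        self.db.execute("INSERT INTO audit_log (user_id, action, created_at) "
                        "VALUES (7, 'login', '2005-01-01 00:00')")
        rows = self.db.execute("SELECT user_id, action FROM audit_log").fetchall()
        self.assertEqual(rows, [(7, "login")])

if __name__ == "__main__":
    unittest.main()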

Each test was unique code from start to finish. A SubmitRequest for content item A would not work for submitting a request for content item B. Very little code could be shared between test scripts. The team was next asked to develop a similar application for a different Web site but was allowed to design the database schema. They were able to make this application much more generic, using only four tables. The object model needed only nine classes. Now the SubmitRequest test script worked for every type of content. This made test script development go much faster, and the scripts were vastly easier to maintain. Rethinking the design for all the content management, the team came up with a design needing just two tables, where a single definition of content could be used to represent all types of information.


Beautiful Data: The Stories Behind Elegant Data Solutions by Toby Segaran, Jeff Hammerbacher

23andMe, airport security, Amazon Mechanical Turk, bioinformatics, Black Swan, business intelligence, card file, cloud computing, computer vision, correlation coefficient, correlation does not imply causation, crowdsourcing, Daniel Kahneman / Amos Tversky, DARPA: Urban Challenge, data acquisition, data science, database schema, double helix, en.wikipedia.org, epigenetics, fault tolerance, Firefox, Gregor Mendel, Hans Rosling, housing crisis, information retrieval, lake wobegon effect, Large Hadron Collider, longitudinal study, machine readable, machine translation, Mars Rover, natural language processing, openstreetmap, Paradox of Choice, power law, prediction markets, profit motive, semantic web, sentiment analysis, Simon Singh, social bookmarking, social graph, SPARQL, sparse data, speech recognition, statistical model, supply-chain management, systematic bias, TED Talk, text mining, the long tail, Vernor Vinge, web application

This is perhaps oversimplifying our database design process, however. I should back up a bit. We are a group of 10 or so graduate students with our own research interests, and, as expected, work on individual components of PEIR. This affected how we worked a great deal. PEIR data was very scattered to begin with. We did not use a unified database schema; we created multiple databases as we needed them, and did not follow any specific design patterns. If anyone joined PEIR during this mid-early stage, he would have been confused by where and what all the data was and who to contact to find out. I say this because I joined the PEIR project midway.

It became quite clear that this consolidation of code and schemas was necessary once user experience development began. In retrospect, it would have been worth the extra effort to take a more calculated approach to data storage in the early goings, but such is the nature of graduate studies. Coordination and code consolidation are not an issue with YFD, since there is only one developer. I can change the database schema, user interface, and data collection mechanism with little fuss. I also use Django, a Python web framework, which uses a model-view-control approach and allows for rapid and efficient development. I do, however, have to do everything myself. Because of the group’s diversity in statistics, computer science, engineering, GIS, and environmental science, PEIR is able to accomplish more—most notably in the area of data processing, as discussed in the next section.
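
For readers unfamiliar with Django, a schema change there is usually just an edit to a model class followed by a migration. The fields below are invented for illustration and are not YFD's real schema; the snippet belongs in a Django app's models.py, and the exact migration commands depend on the Django version in use.

# Hypothetical model; adding a field here and then generating and applying a
# migration is the "little fuss" schema change the text refers to.
from django.db import models

class Entry(models.Model):
    user = models.CharField(max_length=64)
    message = models.TextField()
    posted_at = models.DateTimeField(auto_now_add=True)
    mood = models.CharField(max_length=16, blank=True)  # newly added column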

To balance the need for effective provenance collection with a time-sensitive, highly dynamic workflow, modern approaches to data capture are required: we never want software to be the rate-limiting step in a workflow. Flexible Data Capture Instead of classical database refactoring, typically involving adding, removing, or renaming fields on a database schema, the high-throughput production pipeline at Sanger uses a collection of data modeling tools that allow data descriptions to be changed at runtime. When a laboratory protocol is updated to include a new set of instruments or a faster approach to an existing step, the provenance information captured that relates to that task can be restructured on the fly by laboratory staff.


pages: 713 words: 93,944

Seven Databases in Seven Weeks: A Guide to Modern Databases and the NoSQL Movement by Eric Redmond, Jim Wilson, Jim R. Wilson

AGPL, Amazon Web Services, business logic, create, read, update, delete, data is the new oil, database schema, Debian, domain-specific language, en.wikipedia.org, fault tolerance, full text search, general-purpose programming language, Kickstarter, Large Hadron Collider, linked data, MVC pattern, natural language processing, node package manager, random walk, recommendation engine, Ruby on Rails, seminal paper, Skype, social graph, sparse data, web application

We’ll begin with the many ways that PostgreSQL can search actor/movie names using fuzzy string matching. Then we’ll discover the cube package by creating a movie suggestion system based on similar genres of movies we already like. Since these are all contributed packages, the implementations are special to PostgreSQL and not part of the SQL standard. Commonly, when designing a relational database schema, you’ll start with an entity diagram. We’ll be writing a personal movie suggestion system that keeps track of movies, their genres, and their actors, as modeled in Figure 6, Our movie suggestion system. As a reminder, on Day 1 we installed several contributed packages.
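
Translating that entity diagram into DDL might look roughly like the sketch below. The table and column names are guesses for illustration, not the book's exact schema, and SQLite is used only to keep the sketch self-contained even though the book targets PostgreSQL.

import sqlite3

db = sqlite3.connect(":memory:")
# Movies, genres, and actors, with two join tables for the many-to-many links.
db.executescript("""
CREATE TABLE genres (name TEXT PRIMARY KEY);
CREATE TABLE movies (movie_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE movies_genres (
    movie_id INTEGER REFERENCES movies,
    genre    TEXT    REFERENCES genres,
    PRIMARY KEY (movie_id, genre)
);
CREATE TABLE actors (actor_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE movies_actors (
    movie_id INTEGER REFERENCES movies,
    actor_id INTEGER REFERENCES actors,
    PRIMARY KEY (movie_id, actor_id)
);
""")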

Also, because of the nature of documents, they often map well to object-oriented programming models. This means less impedance mismatch when moving data between the database model and application model. Not-So-Good For: If you’re used to performing elaborate join queries on highly normalized relational database schemas, you’ll find the capabilities of document databases lacking. A document should generally contain most or all of the relevant information required for normal use. So while in a relational database you’d naturally normalize your data to reduce or eliminate copies that can get out of sync, with document databases, denormalized data is the norm.
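
A tiny illustration of that trade-off, with invented data: the relational shape stores each fact once and joins at read time, while the document shape embeds a denormalized copy wherever it is needed.

# Relational style: facts stored once, joined at query time.
authors = {1: {"name": "Jane Doe"}}
books = [{"title": "Example Book", "author_id": 1}]

# Document style: one self-contained document holding everything needed for
# normal use, at the cost of duplicating the author's name.
book_doc = {
    "title": "Example Book",
    "author": {"name": "Jane Doe"},   # denormalized copy
}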


pages: 398 words: 86,855

Bad Data Handbook by Q. Ethan McCallum

Amazon Mechanical Turk, asset allocation, barriers to entry, Benoit Mandelbrot, business intelligence, cellular automata, chief data officer, Chuck Templeton: OpenTable:, cloud computing, cognitive dissonance, combinatorial explosion, commoditize, conceptual framework, data science, database schema, DevOps, en.wikipedia.org, Firefox, Flash crash, functional programming, Gini coefficient, hype cycle, illegal immigration, iterative process, labor-force participation, loose coupling, machine readable, natural language processing, Netflix Prize, One Laptop per Child (OLPC), power law, quantitative trading / quantitative finance, recommendation engine, selection bias, sentiment analysis, SQL injection, statistical model, supply-chain management, survivorship bias, text mining, too big to fail, web application

For example, in a comma-separated file, the newline character separates records, and commas separate fields. Picking a delimiter lives in a class of problems that sucks more energy than is possible to imagine. I know of a production system that uses four bars (||||) to delimit strings, to work around not being able to modify the database schema and a fear of modifying the underlying data to escape text. Paying homage to http://xkcd.com/927/, the ASCII standard includes delimiter characters that are almost never used. ASCII 30 is the record separator and ASCII 31 is designated for delimiting fields. In principle, user-facing systems reading in this data could neatly align things together in a tabular form.
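
For completeness, here is what using those ASCII separators looks like in practice; the sketch below is plain Python with made-up records.

# Writing and reading records with the ASCII record/unit separators the text
# mentions (codes 30 and 31), instead of inventing delimiters like "||||".
RS, US = "\x1e", "\x1f"   # ASCII 30 = record separator, ASCII 31 = unit separator

rows = [["1", "test message", "this is a test message"],
        ["2", "another message", "text that can safely contain , and | characters"]]

encoded = RS.join(US.join(fields) for fields in rows)
decoded = [record.split(US) for record in encoded.split(RS)]
assert decoded == rows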

Whatever our schema, using SQL and RDBMS technology to solve graph problems courts an impedance mismatch often seen in object-oriented systems backed by relational databases. Impedance matching originates in the field of electrical engineering and is the process of matching source and load impedance to maximize power transfer and minimize signal reflection in the transmission line. The object-relational impedance mismatch is a metaphor applying when a relational database schema (source) is at odds with the target object model in object-oriented languages such as Java or Ruby (load), necessitating a mapping between the two (signal reflection/power loss). Object-relational mapping (ORM) frameworks such as Hibernate, MyBatis, and ActiveRecord attempt to abstract away this complexity.


Designing Data-Intensive Applications: The Big Ideas Behind Reliable, Scalable, and Maintainable Systems by Martin Kleppmann

active measures, Amazon Web Services, billion-dollar mistake, bitcoin, blockchain, business intelligence, business logic, business process, c2.com, cloud computing, collaborative editing, commoditize, conceptual framework, cryptocurrency, data science, database schema, deep learning, DevOps, distributed ledger, Donald Knuth, Edward Snowden, end-to-end encryption, Ethereum, ethereum blockchain, exponential backoff, fake news, fault tolerance, finite state, Flash crash, Free Software Foundation, full text search, functional programming, general-purpose programming language, Hacker News, informal economy, information retrieval, Internet of things, iterative process, John von Neumann, Ken Thompson, Kubernetes, Large Hadron Collider, level 1 cache, loose coupling, machine readable, machine translation, Marc Andreessen, microservices, natural language processing, Network effects, no silver bullet, operational security, packet switching, peer-to-peer, performance metric, place-making, premature optimization, recommendation engine, Richard Feynman, self-driving car, semantic web, Shoshana Zuboff, social graph, social web, software as a service, software is eating the world, sorting algorithm, source of truth, SPARQL, speech recognition, SQL injection, statistical model, surveillance capitalism, systematic bias, systems thinking, Tragedy of the Commons, undersea cable, web application, WebSocket, wikimedia commons

If you use Avro, you can fairly easily generate an Avro schema (in the JSON representation we saw earlier) from the relational schema and encode the database contents using that schema, dumping it all to an Avro object container file [25]. You generate a record schema for each database table, and each column becomes a field in that record. The column name in the database maps to the field name in Avro. Now, if the database schema changes (for example, a table has one column added and one column removed), you can just generate a new Avro schema from the updated database schema and export data in the new Avro schema. The data export process does not need to pay any attention to the schema change—it can simply do the schema conversion every time it runs. Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema.
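
A stripped-down sketch of that generation step: take the column names and types as they might be read from the relational catalog and emit an Avro record schema as JSON. The type mapping here is deliberately naive and the column list is invented; a real exporter has to handle nullability, numeric precision, logical types, and so on.

import json

# Columns as they might be read from the relational catalog
# (e.g. information_schema.columns); names and types are illustrative.
columns = [("id", "integer"), ("name", "text"), ("signup_date", "date")]

# Very simplified SQL-to-Avro type mapping.
type_map = {"integer": "long", "text": "string", "date": "string"}

avro_schema = {
    "type": "record",
    "name": "users",   # one record schema per table
    "fields": [{"name": col, "type": type_map[sql_type]} for col, sql_type in columns],
}
print(json.dumps(avro_schema, indent=2))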

In a document database, you would just start writing new documents with the new fields and have code in the application that handles the case when old documents are read. For example: if (user && user.name && !user.first_name) { // Documents written before Dec 8, 2013 don't have first_name user.first_name = user.name.split(" ")[0]; } On the other hand, in a “statically typed” database schema, you would typically perform a migration along the lines of: ALTER TABLE users ADD COLUMN first_name text; UPDATE users SET first_name = split_part(name, ' ', 1); -- PostgreSQL UPDATE users SET first_name = substring_index(name, ' ', 1); -- MySQL Schema changes have a bad reputation of being slow and requiring downtime.

Anyone who reads the new data files will see that the fields of the record have changed, but since the fields are identified by name, the updated writer’s schema can still be matched up with the old reader’s schema. By contrast, if you were using Thrift or Protocol Buffers for this purpose, the field tags would likely have to be assigned by hand: every time the database schema changes, an administrator would have to manually update the mapping from database column names to field tags. (It might be possible to automate this, but the schema generator would have to be very careful to not assign previously used field tags.) This kind of dynamically generated schema simply wasn’t a design goal of Thrift or Protocol Buffers, whereas it was for Avro.


PostGIS in Action, 2nd Edition by Regina O. Obe, Leo S. Hsu

business logic, call centre, crowdsourcing, database schema, Debian, domain-specific language, en.wikipedia.org, Firefox, functional programming, Google Earth, job automation, McMansion, megacity, Mercator projection, Network effects, null island, openstreetmap, planetary scale, profit maximization, Ruby on Rails, Skype, South of Market, San Francisco, SQL injection, traveling salesman, web application

CREATE SCHEMA ch01; In PostgreSQL it’s very easy to back up selected schemas and also to set up permissions based on schemas. You could, for example, have a big schema of fairly static data that you exclude from your daily backups, and also divide schemas along user groups so that you can allow each group to manage its own set of data. The postgis_in_action database schemas are chapter-themed so that it’s easy to download just the set of data you need for a specific chapter. Refer to appendix D for more details about schemas and security management. RESTAURANTS TABLE Next, you need to create a lookup table to map franchise codes to meaningful names. You can then add all the franchises you’ll be dealing with.

The main downside of Imposm is that it only runs on Linux and OS X, and not on Windows, whereas osm2pgsql has support for all the platforms PostGIS runs on. Imposm also currently doesn’t support OSM differential loads, so it’s not a good solution if you have a large OSM data set and want to perform differential updates as things get updated on OSM. One key benefit of Imposm is that it supports custom database schemas, so you can better control the table structures of your data as part of the load process. You can find out more about Imposm at http://imposm.org. Visit www.openstreetmap.org/export to download both utilities. Now that we’ve covered the common free options available for loading data, let’s test these tools. 4.4.1 Getting OSM data You can choose to download and load the whole OSM database, which is about 16 GB in size, or you can download extracts of key areas such as those available at http://download.geofabrik.de/ or http://metro.teczno.com.

Open up the file in an editor and make changes to the path and database connection as appropriate. Once you’ve made the changes and double-checked your typing, go ahead and execute the script file from your OS command line. After you’ve executed the script, you should see a couple of new tables in your tiger_data database schema. The two of most importance are states_all and county_all. Now that you’ve had some practice at loading data, let’s generate a script to load individual state data. If you’re on Linux, Unix, or Mac, run listing 8.4. If you’re on Windows, run listing 8.5. If you created a custom profile in the tiger.loader_platform table, use that name instead of sh or windows.


pages: 153 words: 27,424

REST API Design Rulebook by Mark Masse

anti-pattern, business logic, conceptual framework, create, read, update, delete, data acquisition, database schema, hypertext link, information retrieval, off-the-grid, web application

For example, this URI design: http://api.soccer.restapi.org/leagues/seattle/teams/trebuchet indicates that each of these URIs should also identify an addressable resource: http://api.soccer.restapi.org/leagues/seattle/teams http://api.soccer.restapi.org/leagues/seattle http://api.soccer.restapi.org/leagues http://api.soccer.restapi.org Resource modeling is an exercise that establishes your API’s key concepts. This process is similar to the data modeling for a relational database schema or the classical modeling of an object-oriented system. Before diving directly into the design of URI paths, it may be helpful to first think about the REST API’s resource model. Resource Archetypes When modeling an API’s resources, we can start with some basic resource archetypes.


pages: 696 words: 111,976

SQL Hacks by Andrew Cumming, Gordon Russell

Apollo 13, bioinformatics, book value, business intelligence, business logic, business process, database schema, en.wikipedia.org, Erdős number, Firefox, full text search, Hacker Conference 1984, Hacker Ethic, leftpad, Paul Erdős, SQL injection, Stewart Brand, web application

The marshalling of columns for transport, sending them to your code, and then splitting them up again all require CPU time and effort. You could perform the addition in the database system and save all that code. If lots of code needs those rows added, you could put that processing into a view. This might be more reliable. It would certainly promote code reuse, and it would offer a single point of control so that when the database schema is altered, you've just got to modify the view. Put another way, the choice comes down to this: mysql> CREATE TABLE widetable (a INT,b INT, c INT, d INT, e INT); Query OK, 0 rows affected (0.00 sec) mysql> INSERT INTO widetable VALUES (5,10,19,11,3); Query OK, 1 row affected (0.00 sec) mysql> SELECT a,b,c,d,e FROM widetable; +------+------+------+------+------+ | a | b | c | d | e | +------+------+------+------+------+ | 5 | 10 | 19 | 11 | 3 | +------+------+------+------+------+ 1 row in set (0.00 sec) Then, in your programming language, do (5+10+19+11+3), throw away a,b,c,d,e, and just use the value 48 or this: mysql> SELECT a+b+c+d+e AS v FROM widetable; +------+ | v | +------+ | 48 | +------+ 1 row in set (0.00 sec) In addition to arithmetic, other calculations may best be done in the database.
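
The "put it in a view" option mentioned above looks like this; the sketch uses Python's stdlib sqlite3 purely for convenience, while the book's example is MySQL (the SQL itself is the same).

import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE widetable (a INT, b INT, c INT, d INT, e INT)")
db.execute("INSERT INTO widetable VALUES (5, 10, 19, 11, 3)")

# The view is the single point of control: if the schema changes, only the
# view definition needs updating, not every caller.
db.execute("CREATE VIEW widetable_total AS SELECT a + b + c + d + e AS v FROM widetable")
print(db.execute("SELECT v FROM widetable_total").fetchall())  # [(48,)]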

Another approach is to have the table names combined with a program variable, and then to set the variable to the new prefix. 11.3.2. Keep Your CREATE Script When I am working on a new application one of the first things I do is to create a file called mktable.sql that has the CREATE script for every table being used. Of course, I never get the database schema right the first time, so I have to edit and run the mktable.sql script many times before I have it the way I want it: DROP TABLE casting; DROP TABLE movie; DROP TABLE actor; CREATE TABLE actor (id INTEGER NOT NULL ,name VARCHAR(35) ,PRIMARY KEY (id) ); CREATE INDEX actor_name ON actor(name); CREATE TABLE movie (id INTEGER NOT NULL ,title VARCHAR(70) ,yr DECIMAL(4) ,score FLOAT ,votes INTEGER ,director INTEGER ,PRIMARY KEY (id) ,FOREIGN KEY (director) REFERENCES actor(id) ); CREATE INDEX movie_title ON movie(title); CREATE TABLE casting (movieid INTEGER NOT NULL ,actorid INTEGER NOT NULL ,ord INTEGER ,PRIMARY KEY (movieid, actorid) ,FOREIGN KEY (movieid) REFERENCES movie(id) ,FOREIGN KEY (actorid) REFERENCES actor(id) ); Notice that the DROP commands come before the CREATE commands.


pages: 648 words: 108,814

Solr 1.4 Enterprise Search Server by David Smiley, Eric Pugh

Amazon Web Services, bioinformatics, cloud computing, continuous integration, database schema, domain-specific language, en.wikipedia.org, fault tolerance, Firefox, information retrieval, Ruby on Rails, SQL injection, Wayback Machine, web application, Y Combinator

Another interesting bit of data MusicBrainz stores is the PUID "lookup count", which is how often it has been requested by their servers—a decent measure of popularity. MusicBrainz uses the proprietary "MusicDNS" audio fingerprinting technology, which was donated to the project by MusicIP. It is not open source. One combined index or multiple indices As mentioned in the first chapter, technically, an index is basically like a single-table database schema. Imagine a massive spreadsheet, if you will. In spite of this limitation, there is nothing to stop you from putting different types of data (say, artists and tracks from MusicBrainz) into a single index, thereby, in effect, mitigating this limitation. All you have to do is use different fields for the different document types, and use a field to discriminate between the types.

The field element must have a column attribute that matches the corresponding named column in the SQL query. The name attribute is the Solr schema field name that the column is going into. If it is not specified (and it never is for our example), then it defaults to the column name. Use the SQL AS keyword, as we've done, to use the same names as the Solr schema instead of the database schema. This reduces the number of explicit mappings needed in <field/> elements and shortens existing ones. When a column in the result can be placed directly into Solr without further processing, there is no need to specify the field declaration, because it is implied. An attribute of the entity declaration that we didn't mention yet is transformer.


pages: 999 words: 194,942

Clojure Programming by Chas Emerick, Brian Carper, Christophe Grand

Amazon Web Services, Benoit Mandelbrot, cloud computing, cognitive load, continuous integration, database schema, domain-specific language, don't repeat yourself, drop ship, duck typing, en.wikipedia.org, failed state, finite state, Firefox, functional programming, game design, general-purpose programming language, Guido van Rossum, higher-order functions, Larry Wall, mandelbrot fractal, no silver bullet, Paul Graham, platform as a service, premature optimization, random walk, Ruby on Rails, Schrödinger's Cat, semantic web, software as a service, sorting algorithm, SQL injection, Turing complete, type inference, web application

As a bonus, eliminating synthetic identifiers means that different threads or even processes no longer need to communicate to assign unique identifiers to chunks of data, because you can derive natural identifiers from the data at hand. The debate of the respective virtues of synthetic keys versus natural keys is not new, but is still very much alive when it comes to database schema design. The arguments for synthetic identifiers revolve around two preoccupations: that data is going to survive the process that created it (and the rules it embodies), and that data is hard to refactor. The implication is that synthetic identifiers are of little use for in-process or otherwise application-level computation.

If all we want is to run a query and fetch the results in their entirety, we can easily set up a utility function to do this for us: (defn fetch-results [db-spec query] (jdbc/with-connection db-spec (jdbc/with-query-results res query (doall res)))) ;= #'user/fetch-results (fetch-results db-spec ["SELECT * FROM authors"]) ;= ({:id 1, :first_name "Chas", :last_name "Emerick"} ;= {:id 2, :first_name "Christophe", :last_name "Grand"} ;= {:id 3, :first_name "Brian", :last_name "Carper"}) Transactions Performing database operations as a single transaction is a simple matter of wrapping your code in a transaction form. transaction accepts any number of forms and executes each in turn as part of a single transaction. If an exception is thrown, or if the operation(s) performed will violate a constraint established in your database schema, the transaction will be rolled back. If the body of transaction finishes executing without error, the transaction is committed. (jdbc/with-connection db-spec (jdbc/transaction (jdbc/delete-rows :authors ["id = ?" 1]) (throw (Exception. "Abort transaction!")))) ;= ; Exception Abort transaction!

DOCTYPE hibernate-configuration SYSTEM "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd"> <hibernate-configuration> <session-factory> <property name="hibernate.connection.driver_class">org.sqlite.JDBC</property> <property name="hibernate.connection.url">jdbc:sqlite::memory:</property> <property name="hibernate.dialect">org.hibernate.dialect.HSQLDialect</property> <!-- Drop and re-create the database schema on startup --> <property name="hbm2ddl.auto">create</property> <mapping class="com.clojurebook.hibernate.Author"/> </session-factory> </hibernate-configuration> Finally, if we’re using Leiningen, we just need to add a couple of keys to our project.clj; one to indicate the source root for Java code we’d like it to compile (i.e., Author.java), and another to indicate the resources root, where we placed our hibernate.cfg.xml file: :java-source-path "java" :resources-path "rsrc" This leaves our project structure looking like so: |-- project.clj |-- rsrc | `-- hibernate.cfg.xml |-- java | `-- com | `-- clojurebook | `-- hibernate | |-- Author.java We can now compile our Java class and start a REPL to start seeing how we can use Hibernate from Clojure: % lein javac ... % lein repl We first need to import the required Java classes from Hibernate, and our new Author class: (import 'org.hibernate.SessionFactory 'org.hibernate.cfg.Configuration 'com.clojurebook.hibernate.Author) Hibernate requires setting up a session factory object to allow us to open and close database sessions and run queries.


pages: 480 words: 122,663

The Art of SQL by Stephane Faroult, Peter Robson

business intelligence, business logic, business process, constrained optimization, continuation of politics by other means, database schema, full text search, invisible hand, it's over 9,000, leftpad, SQL injection, technological determinism

All technological solutions are merely means to an end; the great danger for the inexperienced developer is that the attractions of the latest technology become an end in themselves. And the danger is all the greater for enthusiastic, curious, and technically minded individuals! Important Foundations before Fashion: learn your craft before playing with the latest tools. Stable Database Schema The use of data definition language (DDL) to create, alter, or drop database objects inside an application is a very bad practice that in most cases should be banned. There is no reason to dynamically create, alter, or drop objects, with the possible exception of partitions (which I describe in Chapter 5) and temporary tables that are known to the DBMS to be temporary tables.

We therefore need three more tables: people to store information such as name, first name, sex, year of birth, and so on; roles to define how people may contribute to a movie (actor, director, but also composer, director of photography, and the like); and movie_credits to state who was doing what in which movie. Figure 8-3 shows our complete movie schema. Figure 8-3. The movie database schema Let's suppose now that we want to let people search movies in our database by specifying either: words from the title, the name of the director, or up to three names of any of the actors. Following is the source of our prototype page, which I have built in HTML to act as our screen display: <html> <head> <title>Movie Database</title> </head> <body> <CENTER> <HR> <BR> Please fill the form to query our database and click on <b>Search</b> when you are done...


Seeking SRE: Conversations About Running Production Systems at Scale by David N. Blank-Edelman

Affordable Care Act / Obamacare, algorithmic trading, AlphaGo, Amazon Web Services, backpropagation, Black Lives Matter, Bletchley Park, bounce rate, business continuity plan, business logic, business process, cloud computing, cognitive bias, cognitive dissonance, cognitive load, commoditize, continuous integration, Conway's law, crowdsourcing, dark matter, data science, database schema, Debian, deep learning, DeepMind, defense in depth, DevOps, digital rights, domain-specific language, emotional labour, en.wikipedia.org, exponential backoff, fail fast, fallacies of distributed computing, fault tolerance, fear of failure, friendly fire, game design, Grace Hopper, imposter syndrome, information retrieval, Infrastructure as a Service, Internet of things, invisible hand, iterative process, Kaizen: continuous improvement, Kanban, Kubernetes, loose coupling, Lyft, machine readable, Marc Andreessen, Maslow's hierarchy, microaggression, microservices, minimum viable product, MVC pattern, performance metric, platform as a service, pull request, RAND corporation, remote working, Richard Feynman, risk tolerance, Ruby on Rails, Salesforce, scientific management, search engine result page, self-driving car, sentiment analysis, Silicon Valley, single page application, Snapchat, software as a service, software is eating the world, source of truth, systems thinking, the long tail, the scientific method, Toyota Production System, traumatic brain injury, value engineering, vertical integration, web application, WebSocket, zero day

Instead, the go-to solution should be to deploy self-service capabilities at those handoff points. These self-service capabilities should provide pull-based interfaces to whatever was previously needed to be done by someone on the other end of a request queue (investigating performance issues, changing network/firewall settings, adding capacity, updating database schemas, restarts, etc.). The point of self-service is to stay out of the way of people who need an operations task completed. Rather than having someone fill out a ticket and sitting in a request queue, you give them a GUI, API, or command-line tool to do it themselves, when they need to do it. This capability eliminates wait time, shortens feedback loops, avoids miscommunication, and improves the labor capacity of the teams that previously had to field those requests, freeing them from repetitive requests so that they can focus on value-adding engineering work.

The team had always blamed the customers for going over its contracted rates, which is like a parent blaming their infant for eating dirt. Karen rightly expected that her nonoperations background would have been a benefit to the team. It’s not always clear whether a problem will require understanding a database schema, Ruby debugging, C++ performance understanding, product knowledge, or people skills. Karen proposed a new design based on technology she’d used during her internship. Her coworkers were unfamiliar with the new technology and immediately considered it too risky. Karen dropped her proposal without discussion.

Teams faced with supporting software written in multiple languages, with different underlying technologies and frameworks, spend a huge amount of time trying to understand the system and consequently have less time to improve it. To reduce complexity, software engineers deploy more and more abstractions. Abstractions can be like quicksand. Object/relational mappers (ORMs) are a wonderful example of a tool that can make a developer’s life easy by reducing the amount of time thinking about database schemas. By obviating the need for developers to understand the underlying schema, developers no longer consider how ORM changes impact production performance. Operations now need to understand the ORM layer and why it impacts the database. Monolithic designs are often easier to develop and extend than microservices.


pages: 1,266 words: 278,632

Backup & Recovery by W. Curtis Preston

Berlin Wall, business intelligence, business process, database schema, Debian, dumpster diving, failed state, fault tolerance, full text search, job automation, Kickstarter, operational security, rolling blackouts, side project, Silicon Valley, systems thinking, web application

Finding out if the datafile contains a rollback segment is a little more difficult, but it is still possible. If you use rman with a recovery catalog, you can connect to the catalog, set the DBID of the database, and issue the report schema command. This command displays the files, locations, and whether they contain rollback segments. RMAN> report schema; Report of database schema File Size(MB) Tablespace RB segs Datafile Name ---- ---------- ---------------- ------- ------------------- 1 307200 SYSTEM NO /oracle/oradata/trgt/system01.dbf 2 20480 UNDOTBS YES /oracle/oradata/trgt/undotbs01.dbf 3 10240 CWMLITE NO /oracle/oradata/trgt/cwmlite01.dbf 4 10240 DRSYS NO /oracle/oradata/trgt/drsys01.dbf 5 10240 EXAMPLE NO /oracle/oradata/trgt/example01.dbf 6 10240 INDX NO /oracle/oradata/trgt/indx01.dbf 7 10240 TOOLS NO /oracle/oradata/trgt/tools01.dbf 8 10240 USERS NO /oracle/oradata/trgt/users01.dbf To find out which datafiles are in the system tablespace, you’ll need to query sys.dba_data_files: SQL> select file_name, tablespace_name from sys.dba_data_files; /oracle/oradata/trgt/system01.dbf SYSTEM /oracle/oradata/trgt/undotbs01.dbf UNDOTBS /oracle/oradata/trgt/cwmlite01.dbf CWMLITE /oracle/oradata/trgt/drsys01.dbf DRSYS /oracle/oradata/trgt/example01.dbf EXAMPLE /oracle/oradata/trgt/indx01.dbf INDX /oracle/oradata/trgt/tools01.dbf TOOLS /oracle/oradata/trgt/users01.dbf USERS This example report shows three datafiles that are members of the system , sysaux, and undo tablespaces.

restoring, Restoring an RDBMS, Loss of Any Nondata Disk, Loss of a Data Disk, Loss of a Data Disk, Online Partial Restores, Documentation and Testing documentation and testing, Documentation and Testing loss of a data disk, Loss of a Data Disk loss of a nondata disk, Loss of Any Nondata Disk loss of the master database, Loss of a Data Disk online partial restores, Online Partial Restores rollback log, Rollback log row (tuple), Row segments, Tablespace set up information, recording, Take an Inventory tables, Table tablespace, Extents transaction, Transaction transaction log, Transaction log transaction logs, importance of, Don’t Go Overboard view, Table what can go wrong, What Can Happen to an RDBMS? DB2, DB2 Architecture, Managing archive logs, Instance, Databases, Schemas, Tables, Views, Indexes, DB2 engine dispatch units, System catalog tables, Database partition, Database partition, Containers, Tablespaces, Tablespaces, Tablespaces, Large objects (LOBs), Large objects (LOBs), Transaction logs, Transaction logs, Managing archive logs, Managing archive logs, Managing archive logs, The backup Command, Backup levels, Backup levels, Backup levels, Backup path and filenaming convention, Backup path and filenaming convention, Discovering the history of your backup operations, Automatic maintenance, Automatic maintenance, Crash recovery, Version recovery, Rollforward recovery, The restore Command, The rollforward Command, The recover Command, Performing an In-Place Version Recovery, Performing a Redirected Version Recovery, Performing a Rollforward Recovery, Performing a Rollforward Recovery, Reorganizing Data and Collecting Statistics architecture, DB2 Architecture, Managing archive logs, Instance, Databases, Schemas, Tables, Views, Indexes, DB2 engine dispatch units, System catalog tables, Database partition, Database partition, Containers, Tablespaces, Tablespaces, Tablespaces, Large objects (LOBs), Large objects (LOBs), Transaction logs, Transaction logs, Managing archive logs, Managing archive logs archive logging, Managing archive logs circular logging, Managing archive logs container, Containers database, Databases database managed spaces (DMS), Tablespaces engine dispatch units (EDUs), DB2 engine dispatch units index, Indexes instance, Instance large (or long) tablespace, Large objects (LOBs) large objects (LOBs), Large objects (LOBs) partition, Database partition partition group, Database partition schema, Schemas system catalog tables, System catalog tables system managed spaces (SMS), Tablespaces tables, Tables tablespaces, Tablespaces transaction logs, Transaction logs view, Views write ahead logging, Transaction logs automated backup utilities, Automatic maintenance backup command, The backup Command, Backup levels, Backup levels, Backup levels, Backup path and filenaming convention, Discovering the history of your backup operations delta backup, Backup levels full backup, Backup levels incremental backup, Backup levels pathname and filename, Backup path and filenaming convention recovery history file, Discovering the history of your backup operations crash recovery, Crash recovery health monitor, Automatic maintenance in-place version recovery, Performing an In-Place Version Recovery offline backup, Managing archive logs partial backups, Backup path and filenaming convention recover command, The recover Command redirected version recovery, Performing a Redirected Version Recovery restore command, The restore Command rollforward command, The rollforward Command rollforward 
recovery, Rollforward recovery, Performing a Rollforward Recovery, Performing a Rollforward Recovery point in time (PIT), Performing a Rollforward Recovery statistics, gathering, Reorganizing Data and Collecting Statistics version recovery, Version recovery DB2 UDB, IBM DB2 Backup and Recovery, Reorganizing Data and Collecting Statistics dd utility, Backups change atime, Backing Up and Restoring with the dd Utility, Basic dd Options, Using dd to Copy a File or Raw Device, Using dd to Convert Data, Using dd to Determine the Block Size of a Tape, Using dd to Figure out the Backup Format atime, changing, Backups change atime backup format, Using dd to Figure out the Backup Format convert data with, Using dd to Convert Data copy a file or raw device with, Using dd to Copy a File or Raw Device determine block size of a tape with, Using dd to Determine the Block Size of a Tape options, Basic dd Options de-duplication, De-Duplication Backup Systems, De-duplication backup systems, De-Duplication Backup Systems density versus compression, Density Versus Compression development procedures (backups), following proper, Following Proper Development Procedures differential backups, Backup Levels digital audio tape (DAT), DDS drive Digital Data Storage (DDS), DDS drive Digital Linear Tape (DLT), DLT drives (end-of-lifed) directories, cpio’s Restore Options, Torture-Testing Backup Programs, Pass I, Pass III cpio restore, making needed, cpio’s Restore Options data, writing (dump utility), Pass III dump utility, evaluating for backup, Pass I files, becoming (volatile filesystem backups), Torture-Testing Backup Programs disasters, types of, Be Ready for Anything: 10 Types of Disasters, Be Ready for Anything: 10 Types of Disasters disk drives, Be Ready for Anything: 10 Types of Disasters, Reliability failure, losing data from, Be Ready for Anything: 10 Types of Disasters versus tape media, Reliability disk mirroring, System Recovery and Disk Mirroring disk staging, Disk Targets disk-as-disk targets, Disk-As-Disk Targets disk-as-tape units (virtual tape libraries), Disk-As-Tape: Virtual Tape Libraries disk-based backup, You Want to Learn About Disk-Based Backup disk-to-disk-to-tape (D2D2T) backup, You Want to Learn About Disk-Based Backup disks left powered off, Re-presentation ditto utility, Backing Up and Restoring with the ditto Utility, Syntax of ditto When Backing Up, The Options to the ditto Command, Syntax of ditto when Restoring options, The Options to the ditto Command restoring with, Syntax of ditto when Restoring syntax, Syntax of ditto When Backing Up documentation, Take an Inventory, 12,000 gold pieces, Self-preservation: Document, document, document backups, 12,000 gold pieces importance in backup plan, Take an Inventory restore program, importance of, Self-preservation: Document, document, document downtime, cost of, What Will Downtime Cost You?



Natural Language Processing with Python and spaCy by Yuli Vasiliev

Bayesian statistics, computer vision, data science, database schema, Easter island, en.wikipedia.org, loose coupling, natural language processing, Skype, statistical model

This time, you should get the following result: {'product': 'pizza', 'ptype': 'Greek', 'qty': 2} The value of the 'qty' field is now a digit, and we have a consistent format to send to the database. Preparing Your Database Environment To prepare your database environment, you need to install or obtain access to a database; create the components you’ll need in the database, such as a database schema, a table, and so on; and install a Python module that enables you to interact with the database. Although you can choose any database that can receive and process JSON data, such as an Oracle database, this section uses MySQL. The MySQL database has a long history of supporting the most popular data interchange formats, which are XML and JSON.
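
What the database side of that pipeline might look like is sketched below, assuming the MySQL Connector/Python driver; the table definition, column names, and credentials are invented placeholders rather than the book's own schema, and storing the parsed fields in columns is just one possible shape (the text also mentions sending JSON directly).

# Sketch only: assumes `pip install mysql-connector-python` and a local MySQL
# server with hypothetical credentials.
import mysql.connector

order = {"product": "pizza", "ptype": "Greek", "qty": 2}  # output of the NLP step

conn = mysql.connector.connect(host="localhost", user="app",
                               password="secret", database="orders_db")
cur = conn.cursor()
cur.execute("""CREATE TABLE IF NOT EXISTS orders (
                 id INT AUTO_INCREMENT PRIMARY KEY,
                 product VARCHAR(30), ptype VARCHAR(30), qty INT)""")
cur.execute("INSERT INTO orders (product, ptype, qty) VALUES (%s, %s, %s)",
            (order["product"], order["ptype"], order["qty"]))
conn.commit()
conn.close()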


PostgreSQL Cookbook by Chitij Chauhan

database schema, Debian, fault tolerance, GnuPG, Google Glasses, index card, information security

This is useful for comparing what is stored in the database against the definitions in a data or object modeling tool. It is also useful to make sure that you can recreate objects in exactly the correct schema, tablespace, and database with the correct ownership and permissions. To back up all object definitions in all the databases, including roles, tablespaces, databases, schemas, tables, indexes, triggers, functions, constraints, views, ownership, and privileges, you can use the following command in Windows: pg_dumpall --schema-only > c:\pgdump\definitiononly.sql If you want to back up the role definition only, use the following command: pg_dumpall --roles-only > c:\pgdump\myroles.sql If you want to back up tablespace definitions, use the following command: pg_dumpall --tablespaces-only > c:\pgdump\mytablespaces.sql You can also use the pgAdmin tool to back up all the databases on the server, including the roles, users, groups, and tablespaces.


Learning Node.js: A Hands-On Guide to Building Web Applications in JavaScript by Marc Wandschneider

business logic, database schema, en.wikipedia.org, Firefox, functional programming, Google Chrome, node package manager, telemarketer, web application

Again, one of the trickier parts of Node.js is that it’s constantly under development and improvement; things are still changing quite a bit. You use the 2.x series in this book because it’s much more robust than what was there before. Creating a Schema for the Database When working with MySQL, you need to create a database schema for the application, which you place in schema.sql. The first part of this is to create a database with UTF-8 as the default character set and sorting order, as follows: DROP DATABASE IF EXISTS PhotoAlbums; CREATE DATABASE PhotoAlbums DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci; USE PhotoAlbums; You then have to create a schema for your Users table, where you place information for registered users of your photo-sharing app.


pages: 420 words: 61,808

Flask Web Development: Developing Web Applications With Python by Miguel Grinberg

business logic, database schema, Firefox, full text search, information security, Minecraft, platform as a service, web application

Flask-SQLAlchemy creates database tables from models only when they do not exist already, so the only way to make it update tables is by destroying the old tables first, but of course this causes all the data in the database to be lost. A better solution is to use a database migration framework. In the same way source code version control tools keep track of changes to source code files, a database migration framework keeps track of changes to a database schema, and then incremental changes can be applied to the database. The lead developer of SQLAlchemy has written a migration framework called Alembic, but instead of using Alembic directly, Flask applications can use the Flask-Migrate extension, a lightweight Alembic wrapper that integrates with Flask-Script to provide all operations through Flask-Script commands.
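
A minimal wiring sketch for Flask-Migrate is shown below; note that newer releases hook into the flask db command line rather than Flask-Script, so the exact commands depend on the versions you have installed, and the model and database URI here are placeholders.

from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from flask_migrate import Migrate

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///data.sqlite"  # placeholder URI
db = SQLAlchemy(app)
migrate = Migrate(app, db)   # registers the migration commands with the app

class User(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(64), unique=True)
    # Adding a column here and then generating/applying a migration changes the
    # schema incrementally instead of destroying and recreating the tables.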


pages: 232 words: 71,237

Kill It With Fire: Manage Aging Computer Systems by Marianne Bellotti

anti-pattern, barriers to entry, business logic, cloud computing, cognitive bias, computer age, continuous integration, create, read, update, delete, Daniel Kahneman / Amos Tversky, data science, database schema, Dennis Ritchie, DevOps, fault tolerance, fear of failure, Google Chrome, Hans Moravec, iterative process, Ken Thompson, loose coupling, microservices, minimum viable product, Multics, no silver bullet, off-by-one error, platform as a service, pull request, QWERTY keyboard, Richard Stallman, risk tolerance, Schrödinger's Cat, side project, software as a service, Steven Levy, systems thinking, web application, Y Combinator, Y2K

If you configure them to talk to the same database, they are tightly coupled (Figure 3-2). Figure 3-2: Tightly coupled services Such coupling creates a few potential problems. To begin with, any of the three services could make a change to the data that breaks the other two services. Any changes to the database schema have to be coordinated across all three services. By sharing a database, you lose the scaling benefit of having three separate services, because as load increases on one service, it is passed down to the database, and the other services see a dip in performance. However, giving each service its own database trades those problems for other potential problems.


pages: 315 words: 70,044

Learning SPARQL by Bob Ducharme

database schema, Donald Knuth, en.wikipedia.org, G4S, linked data, machine readable, semantic web, SPARQL, web application

If you’ve followed along with the examples in this chapter, then you’ve already used one of these combinations of an RDBMS and SPARQL middleware without knowing it: the Linked Movie Database stores its data using the MySQL database and uses the D2RQ server as a middleware layer. D2RQ is free, and once you give it read access to a relational database, it can read the database schema and generate the files it needs to map SPARQL queries to SQL for that database. (These include the resource and property names that you’ll use in your queries, and can be customized.) If the same D2RQ server has this kind of access to multiple databases, a single SPARQL query can ask for data from multiple databases at once, which is not possible with a standard SQL query.
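
Querying such a SPARQL endpoint from Python is short with the SPARQLWrapper library; the endpoint URL and property name below are placeholders rather than anything defined in the book.

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:2020/sparql")  # assumed local endpoint URL
sparql.setQuery("""
    SELECT ?film ?title WHERE {
        ?film <http://example.org/prop/title> ?title .
    } LIMIT 10
""")
sparql.setReturnFormat(JSON)
for row in sparql.query().convert()["results"]["bindings"]:
    print(row["title"]["value"])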


pages: 602 words: 207,965

Practical Ext JS Projects With Gears by Frank Zammetti

a long time ago in a galaxy far, far away, Albert Einstein, corporate raider, create, read, update, delete, database schema, en.wikipedia.org, fake news, Firefox, full text search, Gordon Gekko, Kickstarter, Larry Wall, leftpad, loose coupling, Ronald Reagan, web application

UML class diagram of the DAO class The class starts out with the databaseName field that names the Gears database we’ll be using. The value in this case is TimekeeperExt. Following that we find 15 fields, the value of each of which is an SQL statement. There are three types of entities this application deals with: projects, tasks, and resources. The database schema is simple, as you can see in Figure 4-5, Figure 4-6, and Figure 4-7—one for each of the three tables corresponding to the three entities involved. Figure 4-5. Table structure of the projects table Figure 4-6.

To do that we’ll have to play with Gears and SQLite a bit: var db = google.gears.factory.create("beta.database"); db.open(inDatabaseName); var rs = db.execute ( "SELECT name, sql FROM sqlite_master where type='table';" ); Figure 7-10. The Tables Window Every SQLite database has a special read-only table named sqlite_master in it. The data in this table describes the database schema. The structure of this table is always what is shown in Table 7-1. Table 7-1. The Structure of the sqlite_master Table Field Description type Tells the type of entity the record describes. For our purposes, the only value we care about is table. name Gives you the name of the entity the record describes.
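
The same sqlite_master query is just as handy outside of Gears; for example, from Python's stdlib sqlite3 (the file name below is illustrative):

import sqlite3

db = sqlite3.connect("timekeeper.db")  # placeholder file name
for name, sql in db.execute("SELECT name, sql FROM sqlite_master WHERE type = 'table'"):
    print(name)
    print(sql)   # the original CREATE TABLE statement for that table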


pages: 286 words: 87,401

Blitzscaling: The Lightning-Fast Path to Building Massively Valuable Companies by Reid Hoffman, Chris Yeh

"Susan Fowler" uber, activist fund / activist shareholder / activist investor, adjacent possible, Airbnb, Amazon Web Services, Andy Rubin, autonomous vehicles, Benchmark Capital, bitcoin, Blitzscaling, blockchain, Bob Noyce, business intelligence, Cambridge Analytica, Chuck Templeton: OpenTable:, cloud computing, CRISPR, crowdsourcing, cryptocurrency, Daniel Kahneman / Amos Tversky, data science, database schema, DeepMind, Didi Chuxing, discounted cash flows, Elon Musk, fake news, Firefox, Ford Model T, forensic accounting, fulfillment center, Future Shock, George Gilder, global pandemic, Google Hangouts, Google X / Alphabet X, Greyball, growth hacking, high-speed rail, hockey-stick growth, hydraulic fracturing, Hyperloop, initial coin offering, inventory management, Isaac Newton, Jeff Bezos, Joi Ito, Khan Academy, late fees, Lean Startup, Lyft, M-Pesa, Marc Andreessen, Marc Benioff, margin call, Mark Zuckerberg, Max Levchin, minimum viable product, move fast and break things, Network effects, Oculus Rift, oil shale / tar sands, PalmPilot, Paul Buchheit, Paul Graham, Peter Thiel, pre–internet, Quicken Loans, recommendation engine, ride hailing / ride sharing, Salesforce, Sam Altman, Sand Hill Road, Saturday Night Live, self-driving car, shareholder value, sharing economy, Sheryl Sandberg, Silicon Valley, Silicon Valley startup, Skype, smart grid, social graph, SoftBank, software as a service, software is eating the world, speech recognition, stem cell, Steve Jobs, subscription business, synthetic biology, Tesla Model S, thinkpad, three-martini lunch, transaction costs, transport as a service, Travis Kalanick, Uber for X, uber lyft, web application, winner-take-all economy, work culture , Y Combinator, yellow journalism

You see, Minna specializes in taking a successful software product and helping it go global. She has a very particular set of skills that she has acquired over a long career dating back to the dot-com era. She knows exactly what a software development and product team needs to do in order to make Internet software work in different languages and markets, in areas ranging from database schemas to user interface. She then works with a cross-functional team to implement these changes in advance of a global rollout. It’s not easy to find people who fit your needs so perfectly; you can’t just go on LinkedIn and filter by “preferred stage of blitzscaling.” (Though come to think of it, that might not be a bad idea…) You’ll probably have to rely on your network for recommendations, which is where your investors and board of directors can help.


pages: 314 words: 94,600

Business Metadata: Capturing Enterprise Knowledge by William H. Inmon, Bonnie K. O'Neil, Lowell Fryman

affirmative action, bioinformatics, business cycle, business intelligence, business process, call centre, carbon-based life, continuous integration, corporate governance, create, read, update, delete, database schema, en.wikipedia.org, folksonomy, informal economy, knowledge economy, knowledge worker, semantic web, tacit knowledge, The Wisdom of Crowds, web application

For more detail in this regard, see his book Data Model Patterns: Conventions of Thought. It is therefore the consensus of the authors that an ER model constitutes business metadata because it communicates relationships between concepts in both a graphical and linguistic way. Parenthetically, ER models are also used to generate technical metadata (database schema), and to communicate relationships between data elements to technical people: database designers, database administrators (DBAs), and developers. 11.4.5 Conceptual Model, RDF and OWL, Topic Map, UML 11.4.5.1 Conceptual Model The techniques we have just discussed (glossaries, thesauri, and data models) are used to capture business metadata in ways that can be understood by human beings.


pages: 629 words: 83,362

Programming TypeScript by Boris Cherny

billion-dollar mistake, bitcoin, business logic, database schema, don't repeat yourself, duck typing, Firefox, functional programming, Internet of things, pull request, source of truth, SQL injection, type inference, web application

Backend Frameworks When you build an application that interacts with a database, you might start with raw SQL or API calls, which are inherently untyped: // PostgreSQL, using node-postgres let client = new Client let res = await client.query( 'SELECT name FROM users where id = $1', [739311] ) // any // MongoDB, using node-mongodb-native db.collection('users') .find({id: 739311}) .toArray((err, user) => // user is any ) With a bit of manual typing you can make these APIs safer and get rid of most of your anys: db.collection('users') .find({id: 739311}) .toArray((err, user: User) => // user is any ) However, raw SQL APIs are still fairly low-level, and it’s still easy to use the wrong type, or forget a type and accidentally end up with anys. That’s where object-relational mappers (ORMs) come in. ORMs generate code from your database schema, giving you high-level APIs to express queries, updates, deletions, and so on. In statically typed languages, these APIs are typesafe, so you don’t have to worry about typing things correctly and manually binding generic type parameters. When accessing your database from TypeScript, consider using an ORM.
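
The book's examples are in TypeScript; to keep the added examples in one language, here is a rough Python sketch of the same idea using SQLAlchemy (1.4+ style). It maps a hypothetical users table onto a class, so queries return objects with known attributes rather than untyped rows. Declaring the model and generating the schema from it is the mirror image of generating code from an existing schema, but it yields the same kind of high-level, typed query API; the table and column names are assumptions, not taken from the book:

from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, Session

Base = declarative_base()

class User(Base):                       # hypothetical table, not from the book
    __tablename__ = "users"
    id = Column(Integer, primary_key=True)
    name = Column(String)

engine = create_engine("sqlite:///example.db")
Base.metadata.create_all(engine)        # create the schema from the model

with Session(engine) as session:
    user = session.get(User, 739311)    # a User instance or None, not a raw row
    print(user.name if user else "no such user")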


pages: 422 words: 86,414

Hands-On RESTful API Design Patterns and Best Practices by Harihara Subramanian

blockchain, business logic, business process, cloud computing, continuous integration, create, read, update, delete, cyber-physical system, data science, database schema, DevOps, disruptive innovation, domain-specific language, fault tolerance, information security, Infrastructure as a Service, Internet of things, inventory management, job automation, Kickstarter, knowledge worker, Kubernetes, loose coupling, Lyft, machine readable, microservices, MITM: man-in-the-middle, MVC pattern, Salesforce, self-driving car, semantic web, single page application, smart cities, smart contracts, software as a service, SQL injection, supply-chain management, web application, WebSocket

Metrics This Java library acts as a powerful toolkit and offers ways to measure the behavior of components that are deployed in production. Combined with other prominent libraries, such as Ehcache and Graphite, Metrics stands out, as it provides full-stack visibility into our RESTful API and web applications.

Liquibase Dropwizard includes the open source solution for managing revisions of database-schema scripts. Liquibase supports various types of databases, and various file formats for defining the DB structure. The highlight of Liquibase is its ability to roll back changes (forward and backward) from a specific point.

Other noteworthy features Before we get into its advantages and disadvantages, let's take a glimpse at other noteworthy libraries in Dropwizard:
• Logging: Logback and slf4j for performant and flexible logging
• Hibernate validator: Offers easy ways for user input validation, along with capabilities of generating i18n-friendly error messages
• HTTP interactions: Bundled with Apache HttpClient and Jersey client libraries, it helps with low-level and high-level HTTP interactions with other web services
• JDBI: Provides a simple, comfortable, and straightforward way to establish relational database connectivity with Java backends
• Templates: Dropwizard supports Freemarker and Mustache, the simplest templating systems for consumer or user-facing applications

Benefits of Dropwizard As we are covering the advantages and disadvantages of each framework, let's touch upon them for Dropwizard as well, so that you can decide why or why not to use Dropwizard.
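
Liquibase itself describes changes in XML, YAML, or SQL changelogs; purely to illustrate the forward-and-backward idea in one language (and not Liquibase's actual format or API), a minimal Python sketch can pair every schema change with the statement that undoes it. The statements are made-up examples:

import sqlite3

# Each revision pairs an "apply" statement with its rollback.
REVISIONS = [
    ("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT)",
     "DROP TABLE projects"),
    ("CREATE TABLE tasks (id INTEGER PRIMARY KEY, project_id INTEGER, title TEXT)",
     "DROP TABLE tasks"),
]

def migrate(conn, forward=True):
    steps = REVISIONS if forward else list(reversed(REVISIONS))
    for apply_sql, rollback_sql in steps:
        conn.execute(apply_sql if forward else rollback_sql)
    conn.commit()

conn = sqlite3.connect(":memory:")
migrate(conn)                  # roll the schema forward
migrate(conn, forward=False)   # roll it back to the starting point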


The Art of Scalability: Scalable Web Architecture, Processes, and Organizations for the Modern Enterprise by Martin L. Abbott, Michael T. Fisher

always be closing, anti-pattern, barriers to entry, Bernie Madoff, business climate, business continuity plan, business intelligence, business logic, business process, call centre, cloud computing, combinatorial explosion, commoditize, Computer Numeric Control, conceptual framework, database schema, discounted cash flows, Dunning–Kruger effect, en.wikipedia.org, fault tolerance, finite state, friendly fire, functional programming, hiring and firing, Infrastructure as a Service, inventory management, machine readable, new economy, OSI model, packet switching, performance metric, platform as a service, Ponzi scheme, power law, RFC: Request For Comment, risk tolerance, Rubik’s Cube, Search for Extraterrestrial Intelligence, SETI@home, shareholder value, Silicon Valley, six sigma, software as a service, the scientific method, transaction costs, Vilfredo Pareto, web application, Y2K

ROLLBACK CAPABILITIES Rollback Technology Considerations We often hear during our discussions around the rollback insurance policy that clients in general agree that being able to roll back would be great but that it is technically not feasible for them. Our answer to this is that it is almost always possible; it just may not be possible with your current team, processes, or architecture. The most commonly cited reason for an inability to roll back in Web enabled platforms and back office IT systems is database schema incompatibility. The argument usually goes that for any major development effort, there may be significant changes to the schema resulting in an incompatibility with the way old and new data is stored. This modification may result in table relationships changing, candidate keys changing, table columns changing, tables added, tables merged, tables disaggregated, and tables removed.

X-axis implementations are limited by the aforementioned replication technology limitations and the size of data that is being replicated. In general, x-axis implementations do not scale well with data size and growth. The Y-Axis of the AKF Database Scale Cube The y-axis of the AKF Database Scale Cube represents a separation of data meaning within your database schema or data storage system. When discussing database scale, we are usually aligning data with a predetermined application y-axis split. The y-axis split addresses the monolithic nature of the data architecture by separating the data into schemas that have meaning relative to the applications performing work on that data or reading from that data.


pages: 419 words: 102,488

Chaos Engineering: System Resiliency in Practice by Casey Rosenthal, Nora Jones

Amazon Web Services, Asilomar, autonomous vehicles, barriers to entry, blockchain, business continuity plan, business intelligence, business logic, business process, cloud computing, cognitive load, complexity theory, continuous integration, cyber-physical system, database schema, DevOps, fail fast, fault tolerance, hindsight bias, human-factors engineering, information security, Kanban, Kubernetes, leftpad, linear programming, loose coupling, microservices, MITM: man-in-the-middle, no silver bullet, node package manager, operational security, OSI model, pull request, ransomware, risk tolerance, scientific management, Silicon Valley, six sigma, Skype, software as a service, statistical model, systems thinking, the scientific method, value engineering, WebSocket

For example, in the absence of unit or functional testing, some code changes may warrant greater scrutiny via code review. Or at particular times (maybe during a daily peak of traffic, or the holiday season experienced by ecommerce sites) a more conservative rate of “ramp-up” for new code paths may be considered. Database schema changes, search index rebuilding, lower-layer network routing, and “cosmetic” markup changes are all examples of activities that carry different uncertainties, different consequences if things go “sideways,” and different contingencies to take when they do. Chaos Engineering is no different; to experiment effectively is to be context-sensitive.


pages: 462 words: 172,671

Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin

business logic, continuous integration, database schema, disinformation, domain-specific language, don't repeat yourself, Donald Knuth, en.wikipedia.org, Eratosthenes, finite state, G4S, Ignaz Semmelweis: hand washing, iterative process, place-making, Rubik’s Cube, web application

This is still duplication and should be addressed by using the TEMPLATE METHOD or STRATEGY pattern [GOF]. Indeed, most of the design patterns that have appeared in the last fifteen years are simply well-known ways to eliminate duplication. So too the Codd Normal Forms are a strategy for eliminating duplication in database schemae. OO itself is a strategy for organizing modules and eliminating duplication. Not surprisingly, so is structured programming. I think the point has been made. Find and eliminate duplication wherever you can. G6: Code at Wrong Level of Abstraction It is important to create abstractions that separate higher level general concepts from lower level detailed concepts.
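
As a concrete (and entirely hypothetical, not-from-the-book) illustration of the point about normal forms: a single wide orders table would repeat each customer's details on every row, while a normalized split stores those facts exactly once.

import sqlite3

conn = sqlite3.connect(":memory:")

# Denormalized: orders(id, customer_name, customer_email, total) repeats
# "Alice" and her email on every one of her orders. Normalizing moves the
# duplicated facts into their own table and references them by key.
conn.executescript("""
    CREATE TABLE customers (
        id    INTEGER PRIMARY KEY,
        name  TEXT,
        email TEXT
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        total       REAL
    );
""")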


Python Geospatial Development - Second Edition by Erik Westra

business logic, capital controls, database schema, Firefox, functional programming, Golden Gate Park, Google Earth, Mercator projection, natural language processing, openstreetmap, Silicon Valley, systems thinking, web application

Shoreline and lake boundaries (as well as other land-water boundaries such as islands within lakes) are readily available using the GSHHS shoreline database. City and town data can be found in two places: The GNIS Database (http://geonames.usgs.gov/domestic) provides official place-name data for the United States, while the GEOnet Names Server (http://earth-info.nga.mil/gns/html) provides similar data for the rest of the world. Looking at these data sources, we can start to design the database schema for the DISTAL system: Tip: The level field in the shorelines table corresponds to the level value in the GSHHS database: a value of 1 represents a coastline, 2 represents a lake, 3 represents an island within a lake, and 4 represents a pond on an island in a lake. While this is very simple, it's enough to get us started.
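
The book goes on to build these tables with real spatial columns; purely as a hedged sketch of the shape of the schema (the places table, the file name, and the BLOB stand-ins for geometry are assumptions, not the book's actual definitions), it might look like this:

import sqlite3

conn = sqlite3.connect("distal.db")  # hypothetical file name
conn.executescript("""
    -- level follows the GSHHS convention quoted above:
    -- 1 = coastline, 2 = lake, 3 = island within a lake, 4 = pond on an island
    CREATE TABLE IF NOT EXISTS shorelines (
        id      INTEGER PRIMARY KEY,
        level   INTEGER,
        outline BLOB   -- simplified stand-in for a real geometry column
    );
    CREATE TABLE IF NOT EXISTS places (
        id       INTEGER PRIMARY KEY,
        name     TEXT, -- from the GNIS database or the GEOnet Names Server
        position BLOB  -- simplified stand-in for a point geometry
    );
""")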


pages: 629 words: 109,663

Docker in Action by Jeff Nickoloff, Stephen Kuenzli

air gap, Amazon Web Services, cloud computing, computer vision, continuous integration, database schema, Debian, end-to-end encryption, exponential backoff, fail fast, failed state, information security, Kubernetes, microservices, MITM: man-in-the-middle, peer-to-peer, software as a service, web application

We’ll use the long form to enhance databases.yml and add a volume to store the postgres data:

version: "3.7"
volumes:
  pgdata: # empty definition uses volume defaults
services:
  postgres:
    image: dockerinaction/postgres:11-alpine
    volumes:
      - type: volume
        source: pgdata # The named volume above
        target: /var/lib/postgresql/data
    environment:
      POSTGRES_PASSWORD: example
  adminer:
    image: dockerinaction/adminer:4
    ports:
      - 8080:8080
    deploy:
      replicas: 1 # Scale down to 1 replica so you can test

In this example, the file defines a volume named pgdata, and the postgres service mounts that volume at /var/lib/postgresql/data. That location is where the PostgreSQL software will store any database schema or data. Deploy the stack and inspect the results:

docker stack deploy \
  -c databases.yml \
  --prune \
  my-databases

Run docker volume ls after the changes have been applied to verify that the operation was successful:

DRIVER    VOLUME NAME
local     my-databases_pgdata

The name of the stack prefixes any resources created for it such as services or volumes.


pages: 395 words: 110,994

The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win by Gene Kim, Kevin Behr, George Spafford

air freight, anti-work, antiwork, Apollo 13, business intelligence, business process, centre right, cloud computing, continuous integration, dark matter, database schema, DevOps, fail fast, friendly fire, Gene Kranz, index card, information security, inventory management, Kanban, Lean Startup, shareholder value, systems thinking, Toyota Production System

It conflicts with some of the Phoenix database changes, so Chris’ guys are going to need to do some recoding.” “Shit,” I say. “Wait. Which Phoenix module?” “It’s one of Sarah’s projects that we released after the project freeze was lifted,” he replies. “It was before we put the kanban around Brent. It was a database schema change that slipped through the cracks.” I swear under my breath. Sarah again? Chris has a pinched expression on his face. “This is going to be tricky. We’re going to have to rename a bunch of database columns, which will affect, who knows, maybe hundreds of files. And all the support scripts.


pages: 444 words: 118,393

The Nature of Software Development: Keep It Simple, Make It Valuable, Build It Piece by Piece by Ron Jeffries

Amazon Web Services, anti-pattern, bitcoin, business cycle, business intelligence, business logic, business process, c2.com, call centre, cloud computing, continuous integration, Conway's law, creative destruction, dark matter, data science, database schema, deep learning, DevOps, disinformation, duck typing, en.wikipedia.org, fail fast, fault tolerance, Firefox, Hacker News, industrial robot, information security, Infrastructure as a Service, Internet of things, Jeff Bezos, Kanban, Kubernetes, load shedding, loose coupling, machine readable, Mars Rover, microservices, Minecraft, minimum viable product, MITM: man-in-the-middle, Morris worm, move fast and break things, OSI model, peer-to-peer lending, platform as a service, power law, ransomware, revision control, Ruby on Rails, Schrödinger's Cat, Silicon Valley, six sigma, software is eating the world, source of truth, SQL injection, systems thinking, text mining, time value of money, transaction costs, Turing machine, two-pizza team, web application, zero day

Adjust connection pool sizes and timeouts. Disable specific outbound integrations. Reload configuration. Start or stop accepting load. Feature toggles. Not every service will need all of these controls. They should give you a place to start, though. Many services also expose controls to update the database schema, or even to delete all data and reseed it. These are presumably helpful in test environments but extremely hazardous in production. These controls result from a breakdown in roles. Developers don’t trust operations to deploy the software and run the scripts correctly. Operations doesn’t allow developers to log in to the production machines to update the schemata.


pages: 779 words: 116,439

Test-Driven Development With Python by Harry J. W. Percival

business logic, continuous integration, database schema, Debian, DevOps, don't repeat yourself, duck typing, Firefox, loose coupling, MVC pattern, off-by-one error, platform as a service, pull request, web application, WebSocket

Test fixtures Test fixtures refers to test data that needs to be set up as a precondition before a test is run—often this means populating the database with some information, but as we’ve seen (with browser cookies), it can involve other types of preconditions. Avoid JSON fixtures Django makes it easy to save and restore data from the database in JSON format (and others) using the dumpdata and loaddata management commands. Most people recommend against using these for test fixtures, as they are painful to manage when your database schema changes. Use the ORM, or a tool like factory_boy. Fixtures also have to work remotely LiveServerTestCase makes it easy to interact with the test database using the Django ORM for tests running locally. Interacting with the database on the staging server is not so straightforward—one solution is Django management commands, as I’ve shown, but you should explore what works for you, and be careful!
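
A minimal sketch of the two suggested alternatives, assuming a to-do-list Item model with a single text field (close to the one the book builds, but treat the app and field names here as placeholders):

# Creating test data through the ORM, inside a Django TestCase or fixture helper:
from lists.models import Item  # placeholder app/model names

def create_items():
    Item.objects.create(text="first item")
    Item.objects.create(text="second item")

# Or with factory_boy, whose defaults live in one place and therefore
# survive database schema changes better than dumped JSON:
import factory
from factory.django import DjangoModelFactory

class ItemFactory(DjangoModelFactory):
    class Meta:
        model = Item

    text = factory.Sequence(lambda n: "item %d" % n)

# ItemFactory.create_batch(3) then saves three Item rows with distinct text.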


Scala in Action by Nilanjan Raychaudhuri

business logic, continuous integration, create, read, update, delete, database schema, domain-specific language, don't repeat yourself, duck typing, en.wikipedia.org, failed state, fault tolerance, functional programming, general-purpose programming language, higher-order functions, index card, Kanban, MVC pattern, type inference, web application

The following is how you represent a story class in Scala: class Story(val number: String, val title: String, val phase: String) To make this class work with Squeryl, you have to do a couple of simple setups. First you have to tell Squeryl you need a table that will store all the stories for you in the database. The way to do that in Squeryl is to create a subclass of org.squeryl.Schema. Think of this class as equivalent to a database schema where you’ll keep all the data definitions of an application. The following code defines the schema with a table called “STORIES” for your Story class: package com.kanban.models import org.squeryl._ object KanbanSchema extends Schema { val stories = table[Story]("STORIES") } Save this as the file KanbanSchema.scala under src/main/scala/com/kanban/models.


pages: 1,758 words: 342,766

Code Complete (Developer Best Practices) by Steve McConnell

Ada Lovelace, Albert Einstein, Buckminster Fuller, business logic, call centre, classic study, continuous integration, data acquisition, database schema, don't repeat yourself, Donald Knuth, fault tolerance, General Magic , global macro, Grace Hopper, haute cuisine, if you see hoof prints, think horses—not zebras, index card, inventory management, iterative process, Larry Wall, loose coupling, Menlo Park, no silver bullet, off-by-one error, Perl 6, place-making, premature optimization, revision control, Sapir-Whorf hypothesis, seminal paper, slashdot, sorting algorithm, SQL injection, statistical model, Tacoma Narrows Bridge, the Cathedral and the Bazaar, the scientific method, Thomas Kuhn: the structure of scientific revolutions, Turing machine, web application

A telecommunications organization went from 86 percent correct before reviewing code changes to 99.6 percent afterward (Perrott 2004). Adjust your approach depending on the risk level of the refactoring. Some refactorings are riskier than others. A refactoring like "Replace a magic number with a named constant" is relatively risk-free. Refactorings that involve class or routine interface changes, database schema changes, or changes to boolean tests, among others, tend to be more risky. For easier refactorings, you might streamline your refactoring process to do more than one refactoring at a time and to simply retest, without going through an official review. For riskier refactorings, err on the side of caution.

You can deal with proliferating software versions by using version-control tools for:
• Source-code control
• Dependency control like that offered by the make utility associated with UNIX
• Project documentation versioning
• Relating project artifacts like requirements, code, and test cases so that when a requirement changes, you can find the code and tests that are affected

Data Dictionaries A data dictionary is a database that describes all the significant data in a project. In many cases, the data dictionary focuses primarily on database schemas. On large projects, a data dictionary is also useful for keeping track of the hundreds or thousands of class definitions. On large team projects, it's useful for avoiding naming clashes. A clash might be a direct, syntactic clash, in which the same name is used twice, or it might be a more subtle clash (or gap) in which different names are used to mean the same thing or the same name is used to mean subtly different things.


Data Wrangling With Python: Tips and Tools to Make Your Life Easier by Jacqueline Kazil

Amazon Web Services, bash_history, business logic, cloud computing, correlation coefficient, crowdsourcing, data acquisition, data science, database schema, Debian, en.wikipedia.org, Fairphone, Firefox, Global Witness, Google Chrome, Hacker News, job automation, machine readable, Nate Silver, natural language processing, pull request, Ronald Reagan, Ruby on Rails, selection bias, social web, statistical model, web application, WikiLeaks

Setting up and getting data into a relational database can involve many steps, but if your datasets are complex with many different relationships, it shouldn’t take more than a few steps to figure out how to join them and get the information you desire. When building relational databases, spend time mapping out the relations and their attributes, similar to what we did with the friend database. What are the different types of data, and how can they be mapped to one another? In a relational database schema, we figure out how we want to match data by thinking about how we will most often use the data. You want the queries you ask the database to be easy to answer. Because we thought we might use occupation to help identify a friend, we put the occupation_id in the friend table. Another thing to note is that there are several different kinds of relationships.
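
As a small sketch of that design (anything beyond the friend table and its occupation_id column is an assumed detail for illustration), the join the schema is optimized for looks like this:

import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE occupation (
        id    INTEGER PRIMARY KEY,
        title TEXT
    );
    CREATE TABLE friend (
        id            INTEGER PRIMARY KEY,
        name          TEXT,
        occupation_id INTEGER REFERENCES occupation(id)
    );
""")

# The question we wanted to be easy to answer: which friends have a given occupation?
rows = conn.execute("""
    SELECT friend.name, occupation.title
    FROM friend
    JOIN occupation ON friend.occupation_id = occupation.id
    WHERE occupation.title = ?
""", ("journalist",)).fetchall()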


pages: 603 words: 141,814

Python for Unix and Linux System Administration by Noah Gift, Jeremy M. Jones

Amazon Web Services, bash_history, Bram Moolenaar, business logic, cloud computing, create, read, update, delete, database schema, Debian, distributed revision control, Firefox, functional programming, Guido van Rossum, industrial robot, inventory management, job automation, Mark Shuttleworth, MVC pattern, skunkworks, web application

You can see that we have simply removed the # character from the beginning of the line to include the admin URLs config file. Now that we have configured a database, added the admin and inventory applications, and added the admin interface to the URLs config file, we are ready to start defining the database schema. In Django, each application has its own schema definition. In each application directory, “inventory” in this case, there is a file named models.py that contains definitions for the tables and columns that your application will use. With Django, as well as many other web frameworks that rely on ORMs, it is possible to create and use a database without having to write a single SQL expression.
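
A hedged sketch of what such a models.py might contain for an inventory application; the fields below are illustrative guesses, not the book's actual model:

# inventory/models.py -- hypothetical fields, not the book's exact definitions
from django.db import models

class Host(models.Model):
    hostname     = models.CharField(max_length=200)
    ip_address   = models.GenericIPAddressField()
    os           = models.CharField(max_length=100)
    purchased_on = models.DateField(null=True, blank=True)

    def __str__(self):
        return self.hostname

Django then derives the CREATE TABLE statements from these class definitions when you run its schema-creation command (manage.py migrate on current releases, or syncdb on the older versions contemporary with this book), so no SQL needs to be written by hand.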


pages: 680 words: 157,865

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design by Diomidis Spinellis, Georgios Gousios

Albert Einstein, barriers to entry, business intelligence, business logic, business process, call centre, continuous integration, corporate governance, database schema, Debian, domain-specific language, don't repeat yourself, Donald Knuth, duck typing, en.wikipedia.org, fail fast, fault tolerance, financial engineering, Firefox, Free Software Foundation, functional programming, general-purpose programming language, higher-order functions, iterative process, linked data, locality of reference, loose coupling, meta-analysis, MVC pattern, Neal Stephenson, no silver bullet, peer-to-peer, premature optimization, recommendation engine, Richard Stallman, Ruby on Rails, semantic web, smart cities, social graph, social web, SPARQL, Steve Jobs, Stewart Brand, Strategic Defense Initiative, systems thinking, the Cathedral and the Bazaar, traveling salesman, Turing complete, type inference, web application, zero-coupon bond

Second, the operations group greatly appreciated the easy deployment of new releases. Previous systems had required the studios to ship removable hard drives back and forth, with all the attendant logistics problems. Finally, having the update mechanism allowed us to focus on “just sufficient” database design. We did not peer into the crystal ball or overengineer the database schema. Instead, we just designed enough of the schema to support the current iteration. Immutable Data and Ubiquitous GUIDs In working with customers, the studio associate creates some compositions that use multiple photographs, inset into a design. These designs come from a design group at company headquarters.


pages: 719 words: 181,090

Site Reliability Engineering: How Google Runs Production Systems by Betsy Beyer, Chris Jones, Jennifer Petoff, Niall Richard Murphy

"Margaret Hamilton" Apollo, Abraham Maslow, Air France Flight 447, anti-pattern, barriers to entry, business intelligence, business logic, business process, Checklist Manifesto, cloud computing, cognitive load, combinatorial explosion, continuous integration, correlation does not imply causation, crowdsourcing, database schema, defense in depth, DevOps, en.wikipedia.org, exponential backoff, fail fast, fault tolerance, Flash crash, George Santayana, Google Chrome, Google Earth, if you see hoof prints, think horses—not zebras, information asymmetry, job automation, job satisfaction, Kubernetes, linear programming, load shedding, loose coupling, machine readable, meta-analysis, microservices, minimum viable product, MVC pattern, no silver bullet, OSI model, performance metric, platform as a service, proprietary trading, reproducible builds, revision control, risk tolerance, side project, six sigma, the long tail, the scientific method, Toyota Production System, trickle-down economics, warehouse automation, web application, zero day

In particular, some of this chapter’s authors were in charge of a portion of the serving system (shown in Figure 31-1) that continually extracts and processes data from the database, in order to generate a set of indexed files that are then loaded and served around the world. This system was distributed over several datacenters and used about 1,000 CPUs and 8 TB of RAM to index 100 TB of data every day. Figure 31-1. A generic ads serving system The migration was nontrivial: in addition to migrating to a new technology, the database schema was significantly refactored and simplified thanks to the ability of F1 to store and index protocol buffer data in table columns. The goal was to migrate the processing system so that it could produce an output perfectly identical to the existing system. This allowed us to leave the serving system untouched and to perform, from the user’s perspective, a seamless migration.


pages: 1,380 words: 190,710

Building Secure and Reliable Systems: Best Practices for Designing, Implementing, and Maintaining Systems by Heather Adkins, Betsy Beyer, Paul Blankinship, Ana Oprea, Piotr Lewandowski, Adam Stubblefield

air gap, anti-pattern, barriers to entry, bash_history, behavioural economics, business continuity plan, business logic, business process, Cass Sunstein, cloud computing, cognitive load, continuous integration, correlation does not imply causation, create, read, update, delete, cryptocurrency, cyber-physical system, database schema, Debian, defense in depth, DevOps, Edward Snowden, end-to-end encryption, exponential backoff, fault tolerance, fear of failure, general-purpose programming language, Google Chrome, if you see hoof prints, think horses—not zebras, information security, Internet of things, Kubernetes, load shedding, margin call, microservices, MITM: man-in-the-middle, NSO Group, nudge theory, operational security, performance metric, pull request, ransomware, reproducible builds, revision control, Richard Thaler, risk tolerance, self-driving car, single source of truth, Skype, slashdot, software as a service, source of truth, SQL injection, Stuxnet, the long tail, Turing test, undersea cable, uranium enrichment, Valgrind, web application, Y2K, zero day

You can consider each of the following to be a deployment:
• Pushing code:
  - Issuing a command to cause a server to download and run a new binary
  - Updating a Kubernetes Deployment object to pick up a new Docker image
  - Booting a VM or physical machine, which loads initial software or firmware
• Updating configuration:
  - Running a SQL command to change a database schema
  - Updating a Kubernetes Deployment object to change a command-line flag
• Publishing a package or other data, which will be consumed by other users:
  - Uploading a deb package to an apt repository
  - Uploading a Docker image to a container registry
  - Uploading an APK to the Google Play Store

Post-deployment changes are out of scope for this chapter.


pages: 1,065 words: 229,099

Real World Haskell by Bryan O'Sullivan, John Goerzen, Donald Stewart, Donald Bruce Stewart

bash_history, database schema, Debian, distributed revision control, domain-specific language, duck typing, en.wikipedia.org, Firefox, functional programming, general-purpose programming language, Guido van Rossum, higher-order functions, job automation, Larry Wall, lateral thinking, level 1 cache, machine readable, p-value, panic early, plutocrats, revision control, sorting algorithm, SQL injection, transfer pricing, type inference, web application, Yochai Benkler

With the lazy version, results will be evaluated in chunks; with the strict version, all results are read up front, stored in RAM, and then printed. Database Metadata Sometimes it can be useful for a program to learn information about the database itself. For instance, a program may want to see what tables exist so that it can automatically create missing tables or upgrade the database schema. In some cases, a program may need to alter its behavior depending on the database backend in use. First, there is a getTables function that will obtain a list of defined tables in a database. You can also use the describeTable function, which will provide information about the defined columns in a given table.
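
getTables and describeTable are HDBC (Haskell) functions; keeping the added examples in a single language, a rough Python analogue of the same metadata queries against SQLite looks like this (example.db is a made-up placeholder):

import sqlite3

conn = sqlite3.connect("example.db")  # hypothetical database file

# Roughly what getTables reports: the names of the defined tables.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]

# Roughly what describeTable reports: the columns defined in a given table.
for table in tables:
    for cid, name, col_type, notnull, default, pk in conn.execute(
            "PRAGMA table_info(%s)" % table):
        print(table, name, col_type)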


pages: 982 words: 221,145

Ajax: The Definitive Guide by Anthony T. Holdener

AltaVista, Amazon Web Services, business logic, business process, centre right, Citizen Lab, Colossal Cave Adventure, create, read, update, delete, database schema, David Heinemeier Hansson, en.wikipedia.org, Firefox, full text search, game design, general-purpose programming language, Guido van Rossum, information retrieval, loose coupling, machine readable, MVC pattern, Necker cube, p-value, Ruby on Rails, SimCity, slashdot, social bookmarking, sorting algorithm, SQL injection, Wayback Machine, web application

The star schema is more popular than the snowflake schema, but you can find good information on both. Principles and Implementations of Datawarehousing by Rajiv Parida (Laxmi Publications) and The Art of SQL by Stéphane Faroult and Peter Robson (O’Reilly) are good places to start for information on database schemas. Other resources include Advanced Topics in Database Research by Keng Siau (Ed.) (Idea Group Publishing) and Oracle Essentials: Oracle Database 10g, Third Edition, by Rick Greenwald et al. (O’Reilly). Databases | 47 Object databases represent information in the form of objects, essentially in the same way as objects are used in object-oriented programming.


pages: 1,302 words: 289,469

The Web Application Hacker's Handbook: Finding and Exploiting Security Flaws by Dafydd Stuttard, Marcus Pinto

business logic, call centre, cloud computing, commoditize, database schema, defense in depth, easy for humans, difficult for computers, Firefox, information retrieval, information security, lateral thinking, machine readable, MITM: man-in-the-middle, MVC pattern, optical character recognition, Ruby on Rails, SQL injection, Turing test, Wayback Machine, web application

NoSQL data stores represent data using key/value mappings and do not rely on a fixed schema such as a conventional database table. Keys and values can be arbitrarily defined, and the format of the value generally is not relevant to the data store. A further feature of key/value storage is that a value may be a data structure itself, allowing hierarchical storage, unlike the flat data structure inside a database schema. NoSQL advocates claim this has several advantages, mainly in handling very large data sets, where the data store's hierarchical structure can be optimized exactly as required to reduce the overhead in retrieving data sets. In these instances a conventional database may require complex cross-referencing of tables to retrieve information on behalf of an application.


pages: 2,054 words: 359,149

The Art of Software Security Assessment: Identifying and Preventing Software Vulnerabilities by Justin Schuh

address space layout randomization, Albert Einstein, Any sufficiently advanced technology is indistinguishable from magic, bash_history, business logic, business process, database schema, Debian, defense in depth, en.wikipedia.org, Firefox, information retrieval, information security, iterative process, Ken Thompson, loose coupling, MITM: man-in-the-middle, Multics, MVC pattern, off-by-one error, operational security, OSI model, RFC: Request For Comment, slashdot, SQL injection, web application

ViewState The ViewState, stored in a client-side cookie, contains information on form parameter content, control status, and other display-specific information. By default, ViewState is protected with a secure message digest by using a secret in the validationKey attribute of the machineKey field in web.config. However, some controls can be bound to data sources that reveal column and table names along with other potential database schema. To address this problem, ViewState can also be encrypted by setting the validation attribute to AES or DES and providing a value for decryptionKey. If ViewState isn’t encrypted, you can use one of many ViewState decoder tools to search for interesting information (a ViewState decoder is available from www.pluralsight.com/tools.aspx).


pages: 923 words: 516,602

The C++ Programming Language by Bjarne Stroustrup

combinatorial explosion, conceptual framework, database schema, Dennis Ritchie, distributed generation, Donald Knuth, fault tolerance, functional programming, general-purpose programming language, higher-order functions, index card, iterative process, job-hopping, L Peter Deutsch, locality of reference, Menlo Park, no silver bullet, Parkinson's law, premature optimization, sorting algorithm

These techniques work well in specific areas where there is either a sound theoretical foundation (e.g., math, state machines, and relational databases) or where a general framework exists into which small application fragments can be embedded (e.g., graphical user interfaces, network simulations, and database schema). The obvious usefulness of these techniques in limited – and typically crucial – areas can tempt people to think that the elimination of traditional programming by these techniques is ‘‘just around the corner.’’ It is not. The reason is that expanding specification techniques outside areas with sound theoretical frameworks implies that the complexity of a general-purpose programming language would be needed in the specification language.