duck typing

27 results


pages: 752 words: 131,533

Python for Data Analysis by Wes McKinney

Alignment Problem, backtesting, Bear Stearns, cognitive dissonance, crowdsourcing, data science, Debian, duck typing, Firefox, functional programming, Google Chrome, Guido van Rossum, index card, machine readable, random walk, recommendation engine, revision control, sentiment analysis, Sharpe ratio, side project, sorting algorithm, statistical model, type inference

search method, Regular expressions, Regular expressions, Regular expressions, Regular expressions searchsorted method, numpy.searchsorted: Finding elements in a Sorted Array, numpy.searchsorted: Finding elements in a Sorted Array, numpy.searchsorted: Finding elements in a Sorted Array seed function, Random Number Generation seek method, Files and the operating system semantics, Language Semantics–Mutable and immutable objects, Indentation, not braces–Indentation, not braces, Everything is an object–Everything is an object, Comments–Comments, Function and object method calls–Function and object method calls, Function and object method calls–Function and object method calls, Variables and pass-by-reference–Variables and pass-by-reference, Variables and pass-by-reference–Variables and pass-by-reference, Dynamic references, strong types–Dynamic references, strong types, Attributes and methods–Attributes and methods, “Duck” typing–“Duck” typing, Imports–Imports, Binary operators and comparisons–Binary operators and comparisons, Strictness versus laziness–Strictness versus laziness, Mutable and immutable objects–Mutable and immutable objects attributes in, Attributes and methods–Attributes and methods comments in, Comments–Comments “duck” typing, “Duck” typing–“Duck” typing functions in, Function and object method calls–Function and object method calls import directive, Imports–Imports indentation, Indentation, not braces–Indentation, not braces methods in, Function and object method calls–Function and object method calls mutable objects in, Mutable and immutable objects–Mutable and immutable objects object model, Everything is an object–Everything is an object operators for, Binary operators and comparisons–Binary operators and comparisons references in, Variables and pass-by-reference–Variables and pass-by-reference strict evaluation, Strictness versus laziness–Strictness versus laziness strongly-typed language, Dynamic references, strong types–Dynamic references, 
strong types variables in, Variables and pass-by-reference–Variables and pass-by-reference semicolons, Indentation, not braces sentinels, Handling Missing Data, Reading and Writing Data in Text Format sep argument, Reading and Writing Data in Text Format sequence functions, Built-in Sequence Functions–reversed, enumerate–enumerate, sorted–sorted, zip–zip, reversed–reversed enumerate function, enumerate–enumerate reversed function, reversed–reversed sorted function, sorted–sorted zip function, zip–zip Series data structure, Introduction to pandas Data Structures, Series–Series, Operations between DataFrame and Series–Operations between DataFrame and Series, Grouping with Dicts and Series–Grouping with Dicts and Series arithmetic operations between DataFrame and, Operations between DataFrame and Series–Operations between DataFrame and Series grouping with, Grouping with Dicts and Series–Grouping with Dicts and Series set comprehensions, List, Set, and Dict Comprehensions–Nested list comprehensions set function, Set setattr function, Attributes and methods setdefault method, Default values setdiff1d method, Unique and Other Set Logic sets/set comprehensions, Set–Set setxor1d method, Unique and Other Set Logic set_index function, Using a DataFrame’s Columns, Using a DataFrame’s Columns set_index method, Pivoting “long” to “wide” Format set_title method, Setting the title, axis labels, ticks, and ticklabels set_trace function, Other ways to make use of the debugger, Other ways to make use of the debugger, Other ways to make use of the debugger set_value method, Indexing, selection, and filtering set_xlabel method, Setting the title, axis labels, ticks, and ticklabels set_xlim method, Ticks, Labels, and Legends set_xticklabels method, Setting the title, axis labels, ticks, and ticklabels set_xticks method, Setting the title, axis labels, ticks, and ticklabels shapefiles, Plotting Maps: Visualizing Haiti Earthquake Crisis Data shapes, The NumPy ndarray: A Multidimensional 
Array Object, ndarray Object Internals sharex option, Figures and Subplots, Line Plots sharey option, Figures and Subplots, Line Plots shell commands in IPython, Shell Commands and Aliases–Shell Commands and Aliases shifting in time series data, Shifting (Leading and Lagging) Data–Shifting dates with offsets shortcuts, keyboard, Keyboard Shortcuts–Keyboard Shortcuts, Keyboard Shortcuts, Keyboard Shortcuts, Keyboard Shortcuts for deleting text, Keyboard Shortcuts for IPython, Keyboard Shortcuts–Keyboard Shortcuts shuffle function, Random Number Generation sign function, Universal Functions: Fast Element-wise Array Functions, Detecting and Filtering Outliers signal frontier analysis, Signal Frontier Analysis–Signal Frontier Analysis sin function, Universal Functions: Fast Element-wise Array Functions sinh function, Universal Functions: Fast Element-wise Array Functions size method, GroupBy Mechanics skew method, Summarizing and Computing Descriptive Statistics skipinitialspace option, Manually Working with Delimited Formats skipna method, Summarizing and Computing Descriptive Statistics skipna option, Summarizing and Computing Descriptive Statistics skiprows argument, Reading and Writing Data in Text Format skip_footer argument, Reading and Writing Data in Text Format slice method, Vectorized string functions in pandas slicing, Basic Indexing and Slicing–Indexing with slices, Slicing–Slicing arrays, Basic Indexing and Slicing–Indexing with slices lists, Slicing–Slicing Social Security Administration (SSA), US Baby Names 1880-2010 solve function, Linear Algebra sort argument, Database-style DataFrame Merges sort method, Sorting, More About Sorting, More About Sorting, Sorting, Anonymous (lambda) Functions sorted function, sorted–sorted, sorted, sorted sorting, Sorting–Sorting, Sorting and ranking–Sorting and ranking, Reordering and Sorting Levels–Reordering and Sorting Levels, More About Sorting–numpy.searchsorted: Finding elements in a Sorted Array, Indirect Sorts: 
argsort and lexsort–Indirect Sorts: argsort and lexsort, Alternate Sort Algorithms–Alternate Sort Algorithms, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array, Sorting–Sorting arrays, Sorting–Sorting finding elements in sorted array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array in NumPy, More About Sorting–numpy.searchsorted: Finding elements in a Sorted Array, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort, Alternate Sort Algorithms–Alternate Sort Algorithms, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array algorithms for, Alternate Sort Algorithms–Alternate Sort Algorithms finding elements in sorted array, numpy.searchsorted: Finding elements in a Sorted Array–numpy.searchsorted: Finding elements in a Sorted Array indirect sorts, Indirect Sorts: argsort and lexsort–Indirect Sorts: argsort and lexsort in pandas, Sorting and ranking–Sorting and ranking levels, Reordering and Sorting Levels–Reordering and Sorting Levels lists, Sorting–Sorting sortlevel function, Reordering and Sorting Levels, Reordering and Sorting Levels sort_columns argument, Line Plots sort_index method, Sorting and ranking, Reordering and Sorting Levels, Indirect Sorts: argsort and lexsort spaces, structuring code with, Indentation, not braces–Indentation, not braces spacing around subplots, Adjusting the spacing around subplots–Adjusting the spacing around subplots span, Exponentially-weighted functions specialized frequencies, Operations with Time Series of Different Frequencies–Using periods instead of timestamps data munging for, Operations with Time Series of Different Frequencies–Using periods instead of timestamps split method, Manually Working with Delimited Formats, 
String Object Methods, Regular expressions, Vectorized string functions in pandas, Concatenating and Splitting Arrays split-apply-combine, GroupBy Mechanics splitting arrays, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_ SQL databases, Interacting with Databases sql module, Interacting with Databases SQLite databases, Interacting with Databases sqrt function, Universal Functions: Fast Element-wise Array Functions, Universal Functions: Fast Element-wise Array Functions square function, Universal Functions: Fast Element-wise Array Functions squeeze argument, Reading and Writing Data in Text Format SSA (Social Security Administration), US Baby Names 1880-2010 stable sorting, Alternate Sort Algorithms stacked format, Pivoting “long” to “wide” Format start index, Slicing, Slicing startswith method, String Object Methods, Vectorized string functions in pandas, Vectorized string functions in pandas statistical methods, Mathematical and Statistical Methods–Mathematical and Statistical Methods std method, Mathematical and Statistical Methods, Summarizing and Computing Descriptive Statistics, Data Aggregation stdout, Writing Data Out to Text Format step index, Slicing stop index, Slicing, Slicing strftime method, Converting between string and datetime, Dates and times strict evaluation/language, Strictness versus laziness–Strictness versus laziness, Strictness versus laziness strides/strided view, ndarray Object Internals, ndarray Object Internals strings, Data Types for ndarrays, Data Types for ndarrays, String Manipulation–Vectorized string functions in pandas, String Object Methods–String Object Methods, Regular expressions–Regular expressions, Vectorized string functions in pandas–Vectorized string functions in pandas, Converting between string and datetime–Converting between string and datetime, Strings–Strings converting to datetime, Converting between string and datetime–Converting between string and datetime data types for, Data Types for ndarrays, 
Data Types for ndarrays, Strings–Strings manipulating, String Manipulation–Vectorized string functions in pandas, String Object Methods–String Object Methods, Regular expressions–Regular expressions, Vectorized string functions in pandas–Vectorized string functions in pandas methods for, String Object Methods–String Object Methods vectorized string methods, Vectorized string functions in pandas–Vectorized string functions in pandas with regular expressions, Regular expressions–Regular expressions strip method, String Object Methods, Vectorized string functions in pandas, Vectorized string functions in pandas strongly-typed languages, Dynamic references, strong types–Dynamic references, strong types, Dynamic references, strong types strptime method, Converting between string and datetime, Dates and times structs, Structured and Record Arrays structured arrays, Structured and Record Arrays–Structured Array Manipulations: numpy.lib.recfunctions, Structured and Record Arrays, Nested dtypes and Multidimensional Fields–Nested dtypes and Multidimensional Fields, Why Use Structured Arrays?

beta function, Random Number Generation, Group Factor Exposures defined, Group Factor Exposures between_time method, Time of Day and “as of” Data Selection bfill method, Reindexing bin edges, Downsampling binary data formats, Storing Arrays on Disk in Binary Format–Storing Arrays on Disk in Binary Format, Binary Data Formats–Reading Microsoft Excel Files, Using HDF5 Format–Using HDF5 Format, Reading Microsoft Excel Files–Reading Microsoft Excel Files HDF5, Using HDF5 Format–Using HDF5 Format Microsoft Excel files, Reading Microsoft Excel Files–Reading Microsoft Excel Files storing arrays in, Storing Arrays on Disk in Binary Format–Storing Arrays on Disk in Binary Format binary moving window functions, Binary Moving Window Functions–Binary Moving Window Functions binary search of lists, Binary search and maintaining a sorted list–Binary search and maintaining a sorted list binary universal functions, Universal Functions: Fast Element-wise Array Functions binding, Variables and pass-by-reference, Closures: Functions that Return Functions defined, Variables and pass-by-reference variables, Closures: Functions that Return Functions binomial function, Random Number Generation bisect module, Binary search and maintaining a sorted list, Binary search and maintaining a sorted list bookmarking directories in IPython, Directory Bookmark System–Directory Bookmark System Boolean, Data Types for ndarrays, Boolean Indexing–Boolean Indexing, Methods for Boolean Arrays–Methods for Boolean Arrays, Booleans–Booleans arrays, Methods for Boolean Arrays–Methods for Boolean Arrays data type, Data Types for ndarrays, Booleans–Booleans indexing for arrays, Boolean Indexing–Boolean Indexing bottleneck library, Moving Window Functions braces ({}), Dict brackets ([]), Tuple, List break keyword, for loops broadcasting, Basic Indexing and Slicing, Repeating Elements: Tile and Repeat, Broadcasting–Setting Array Values by Broadcasting, Broadcasting, Broadcasting Over Other Axes–Broadcasting Over 
Other Axes, Setting Array Values by Broadcasting–Setting Array Values by Broadcasting defined, Basic Indexing and Slicing, Repeating Elements: Tile and Repeat, Broadcasting over other axes, Broadcasting Over Other Axes–Broadcasting Over Other Axes setting array values by, Setting Array Values by Broadcasting–Setting Array Values by Broadcasting bucketing, Bucketing Donation Amounts–Bucketing Donation Amounts C calendar module, Date and Time Data Types and Tools casting, Data Types for ndarrays cat method, Reading and Writing Data in Text Format, Vectorized string functions in pandas Categorical object, Discretization and Binning ceil function, Universal Functions: Fast Element-wise Array Functions center method, Vectorized string functions in pandas Chaco, Chaco–Chaco chisquare function, Random Number Generation chunksize argument, Reading and Writing Data in Text Format, Reading Text Files in Pieces, Reading Text Files in Pieces clearing screen shortcut, Keyboard Shortcuts clipboard, executing code from, Executing Code from the Clipboard–IPython interaction with editors and IDEs clock function, Timing Code: %time and %timeit close method, A Brief matplotlib API Primer, A Brief matplotlib API Primer, Files and the operating system closures, Closures: Functions that Return Functions–Closures: Functions that Return Functions cmd.exe, Windows collections module, Default values colons, Indentation, not braces cols option, Pivot Tables and Cross-Tabulation columns, grouping on, Selecting a Column or Subset of Columns–Selecting a Column or Subset of Columns column_stack function, Concatenating and Splitting Arrays combinations function, itertools module combine_first method, Combining and Merging Data Sets, Combining Data with Overlap, Combining Data with Overlap combining, Combining Data with Overlap–Combining Data with Overlap, Splicing Together Data Sources–Splicing Together Data Sources, Concatenating and combining lists–Concatenating and combining lists data 
sources, Splicing Together Data Sources–Splicing Together Data Sources data sources, with overlap, Combining Data with Overlap–Combining Data with Overlap lists, Concatenating and combining lists–Concatenating and combining lists commands, Keyboard Shortcuts, Using the Command History–Logging the Input and Output, Searching and Reusing the Command History–Searching and Reusing the Command History, Input and Output Variables–Input and Output Variables, Logging the Input and Output–Logging the Input and Output, Interactive Debugger, Interactive Debugger (see also magic commands) debugger, Interactive Debugger history in IPython, Using the Command History–Logging the Input and Output, Searching and Reusing the Command History–Searching and Reusing the Command History, Input and Output Variables–Input and Output Variables, Logging the Input and Output–Logging the Input and Output input and output variables, Input and Output Variables–Input and Output Variables logging of, Logging the Input and Output–Logging the Input and Output reusing command history, Searching and Reusing the Command History–Searching and Reusing the Command History searching for, Keyboard Shortcuts comment argument, Reading and Writing Data in Text Format comments in Python, Comments–Comments compile method, Regular expressions, Regular expressions complex128 data type, Data Types for ndarrays complex256 data type, Data Types for ndarrays complex64 data type, Data Types for ndarrays concat function, US Baby Names 1880-2010, Combining and Merging Data Sets, Merging on Index, Concatenating Along an Axis, Concatenating Along an Axis, Apply: General split-apply-combine, Concatenating and Splitting Arrays, Concatenating and Splitting Arrays, Concatenating and Splitting Arrays concatenating, Concatenating Along an Axis–Concatenating Along an Axis, Concatenating and Splitting Arrays–Stacking helpers: r_ and c_ along axis, Concatenating Along an Axis–Concatenating Along an Axis arrays, Concatenating and 
Splitting Arrays–Stacking helpers: r_ and c_ conditional logic as array operation, Expressing Conditional Logic as Array Operations–Expressing Conditional Logic as Array Operations conferences, Community and Conferences configuring matplotlib, matplotlib Configuration–matplotlib Configuration conforming, Reindexing contains method, Vectorized string functions in pandas contiguous memory, The Importance of Contiguous Memory–The Importance of Contiguous Memory continue keyword, for loops continuous return, Future Contract Rolling convention argument, Resampling and Frequency Conversion converting, Converting between string and datetime–Converting between string and datetime, Converting Timestamps to Periods (and Back)–Converting Timestamps to Periods (and Back) between string and datetime, Converting between string and datetime–Converting between string and datetime timestamps to periods, Converting Timestamps to Periods (and Back)–Converting Timestamps to Periods (and Back) coordinated universal time (UTC), Time Zone Handling copy argument, Database-style DataFrame Merges copy method, DataFrame copysign function, Universal Functions: Fast Element-wise Array Functions corr method, Correlation and Covariance, Correlation and Covariance correlation, Correlation and Covariance–Correlation and Covariance corrwith method, Correlation and Covariance cos function, Universal Functions: Fast Element-wise Array Functions cosh function, Universal Functions: Fast Element-wise Array Functions count method, Summarizing and Computing Descriptive Statistics, String Object Methods, Vectorized string functions in pandas, Data Aggregation, Tuple methods Counter class, Counting Time Zones in Pure Python cov method, Correlation and Covariance, Correlation and Covariance covariance, Correlation and Covariance–Correlation and Covariance CPython, Installation and Setup cross-section, Financial and Economic Data Applications crosstab function, Cross-Tabulations: Crosstab–Cross-Tabulations: 
Crosstab crowdsourcing, Plotting Maps: Visualizing Haiti Earthquake Crisis Data CSV files, Manually Working with Delimited Formats–Manually Working with Delimited Formats, Plotting Maps: Visualizing Haiti Earthquake Crisis Data Ctrl-A keyboard shortcut, Keyboard Shortcuts Ctrl-B keyboard shortcut, Keyboard Shortcuts Ctrl-C keyboard shortcut, Keyboard Shortcuts Ctrl-E keyboard shortcut, Keyboard Shortcuts Ctrl-F keyboard shortcut, Keyboard Shortcuts Ctrl-K keyboard shortcut, Keyboard Shortcuts Ctrl-L keyboard shortcut, Keyboard Shortcuts Ctrl-N keyboard shortcut, Keyboard Shortcuts Ctrl-P keyboard shortcut, Keyboard Shortcuts Ctrl-R keyboard shortcut, Keyboard Shortcuts Ctrl-Shift-V keyboard shortcut, Keyboard Shortcuts Ctrl-U keyboard shortcut, Keyboard Shortcuts cummax method, Summarizing and Computing Descriptive Statistics cummin method, Summarizing and Computing Descriptive Statistics cumprod method, Mathematical and Statistical Methods, Summarizing and Computing Descriptive Statistics cumsum method, Mathematical and Statistical Methods, Summarizing and Computing Descriptive Statistics cumulative returns, Return Indexes and Cumulative Returns–Return Indexes and Cumulative Returns currying, Currying: Partial Argument Application–Currying: Partial Argument Application, Currying: Partial Argument Application cursor, moving with keyboard, Keyboard Shortcuts custom universal functions, Custom ufuncs–Custom ufuncs cut function, Discretization and Binning, Discretization and Binning, Discretization and Binning, Discretization and Binning, Discretization and Binning, Quantile and Bucket Analysis, Bucketing Donation Amounts Cython project, Python as Glue, Other Speed Options: Cython, f2py, C–Other Speed Options: Cython, f2py, C c_ object, Stacking helpers: r_ and c_–Stacking helpers: r_ and c_ D data aggregation, Data Aggregation–Returning Aggregated Data in “unindexed” Form, Column-wise and Multiple Function Application–Column-wise and Multiple Function Application, 
Returning Aggregated Data in “unindexed” Form–Returning Aggregated Data in “unindexed” Form returning data in unindexed form, Returning Aggregated Data in “unindexed” Form–Returning Aggregated Data in “unindexed” Form using multiple functions, Column-wise and Multiple Function Application–Column-wise and Multiple Function Application data alignment, Arithmetic and data alignment–Operations between DataFrame and Series, Arithmetic methods with fill values–Arithmetic methods with fill values, Operations between DataFrame and Series–Operations between DataFrame and Series arithmetic methods with fill values, Arithmetic methods with fill values–Arithmetic methods with fill values operations between DataFrame and Series, Operations between DataFrame and Series–Operations between DataFrame and Series data munging, Data Munging Topics–Return Indexes and Cumulative Returns, Time Series and Cross-Section Alignment–Time Series and Cross-Section Alignment, Operations with Time Series of Different Frequencies–Using periods instead of timestamps, Time of Day and “as of” Data Selection–Time of Day and “as of” Data Selection, Splicing Together Data Sources–Splicing Together Data Sources asof method, Time of Day and “as of” Data Selection–Time of Day and “as of” Data Selection combining data, Splicing Together Data Sources–Splicing Together Data Sources for data alignment, Time Series and Cross-Section Alignment–Time Series and Cross-Section Alignment for specialized frequencies, Operations with Time Series of Different Frequencies–Using periods instead of timestamps data structures for pandas, Introduction to pandas Data Structures–Index Objects, Series–Series, DataFrame–DataFrame, Index Objects–Index Objects, Panel Data–Panel Data DataFrame, DataFrame–DataFrame Index objects, Index Objects–Index Objects Panel, Panel Data–Panel Data Series, Series–Series data types, Data Types for ndarrays–Data Types for ndarrays, Data Types for ndarrays–Data Types for ndarrays, Date and Time 
Data Types and Tools–Converting between string and datetime, Converting between string and datetime–Converting between string and datetime, ndarray Object Internals–NumPy dtype Hierarchy, NumPy dtype Hierarchy–NumPy dtype Hierarchy, Nested dtypes and Multidimensional Fields–Nested dtypes and Multidimensional Fields, Scalar Types–Dates and times, Numeric types–Numeric types, Strings–Strings, Booleans–Booleans, Type casting–Type casting, None–None, Dates and times–Dates and times for arrays, Data Types for ndarrays–Data Types for ndarrays for ndarray, Data Types for ndarrays–Data Types for ndarrays for NumPy, ndarray Object Internals–NumPy dtype Hierarchy, NumPy dtype Hierarchy–NumPy dtype Hierarchy hierarchy of, NumPy dtype Hierarchy–NumPy dtype Hierarchy for Python, Scalar Types–Dates and times, Numeric types–Numeric types, Strings–Strings, Booleans–Booleans, Type casting–Type casting, None–None, Dates and times–Dates and times boolean data type, Booleans–Booleans dates and times, Dates and times–Dates and times None data type, None–None numeric data types, Numeric types–Numeric types str data type, Strings–Strings type casting in, Type casting–Type casting for time series data, Date and Time Data Types and Tools–Converting between string and datetime, Converting between string and datetime–Converting between string and datetime converting between string and datetime, Converting between string and datetime–Converting between string and datetime nested, Nested dtypes and Multidimensional Fields–Nested dtypes and Multidimensional Fields data wrangling, Combining and Merging Data Sets–Combining Data with Overlap, Database-style DataFrame Merges–Database-style DataFrame Merges, Merging on Index–Merging on Index, Concatenating Along an Axis–Concatenating Along an Axis, Combining Data with Overlap–Combining Data with Overlap, Reshaping with Hierarchical Indexing–Reshaping with Hierarchical Indexing, Pivoting “long” to “wide” Format–Pivoting “long” to “wide” Format, Data 
Transformation–Computing Indicator/Dummy Variables, Removing Duplicates–Removing Duplicates, Transforming Data Using a Function or Mapping–Transforming Data Using a Function or Mapping, Replacing Values–Replacing Values, Renaming Axis Indexes–Renaming Axis Indexes, Discretization and Binning–Discretization and Binning, Detecting and Filtering Outliers–Detecting and Filtering Outliers, Permutation and Random Sampling–Permutation and Random Sampling, Computing Indicator/Dummy Variables–Computing Indicator/Dummy Variables, String Manipulation–Vectorized string functions in pandas, String Object Methods–String Object Methods, Regular expressions–Regular expressions, Vectorized string functions in pandas–Vectorized string functions in pandas, Example: USDA Food Database–Example: USDA Food Database manipulating strings, String Manipulation–Vectorized string functions in pandas, String Object Methods–String Object Methods, Regular expressions–Regular expressions, Vectorized string functions in pandas–Vectorized string functions in pandas methods for, String Object Methods–String Object Methods vectorized string methods, Vectorized string functions in pandas–Vectorized string functions in pandas with regular expressions, Regular expressions–Regular expressions merging data, Combining and Merging Data Sets–Combining Data with Overlap, Database-style DataFrame Merges–Database-style DataFrame Merges, Merging on Index–Merging on Index, Concatenating Along an Axis–Concatenating Along an Axis, Combining Data with Overlap–Combining Data with Overlap combining data with overlap, Combining Data with Overlap–Combining Data with Overlap concatenating along axis, Concatenating Along an Axis–Concatenating Along an Axis DataFrame merges, Database-style DataFrame Merges–Database-style DataFrame Merges on index, Merging on Index–Merging on Index pivoting, Pivoting “long” to “wide” Format–Pivoting “long” to “wide” Format reshaping, Reshaping with Hierarchical Indexing–Reshaping with 
Hierarchical Indexing transforming data, Data Transformation–Computing Indicator/Dummy Variables, Removing Duplicates–Removing Duplicates, Transforming Data Using a Function or Mapping–Transforming Data Using a Function or Mapping, Replacing Values–Replacing Values, Renaming Axis Indexes–Renaming Axis Indexes, Discretization and Binning–Discretization and Binning, Detecting and Filtering Outliers–Detecting and Filtering Outliers, Permutation and Random Sampling–Permutation and Random Sampling, Computing Indicator/Dummy Variables–Computing Indicator/Dummy Variables discretization, Discretization and Binning–Discretization and Binning dummy variables, Computing Indicator/Dummy Variables–Computing Indicator/Dummy Variables filtering outliers, Detecting and Filtering Outliers–Detecting and Filtering Outliers mapping, Transforming Data Using a Function or Mapping–Transforming Data Using a Function or Mapping permutation, Permutation and Random Sampling–Permutation and Random Sampling removing duplicates, Removing Duplicates–Removing Duplicates renaming axis indexes, Renaming Axis Indexes–Renaming Axis Indexes replacing values, Replacing Values–Replacing Values USDA food database example, Example: USDA Food Database–Example: USDA Food Database databases, Interacting with Databases–Storing and Loading Data in MongoDB reading and writing to, Interacting with Databases–Storing and Loading Data in MongoDB DataFrame data structure, Counting Time Zones with pandas, MovieLens 1M Data Set, Introduction to pandas Data Structures, DataFrame–DataFrame, Operations between DataFrame and Series–Operations between DataFrame and Series, Using a DataFrame’s Columns–Using a DataFrame’s Columns, Database-style DataFrame Merges–Database-style DataFrame Merges arithmetic operations between Series and, Operations between DataFrame and Series–Operations between DataFrame and Series hierarchical indexing using, Using a DataFrame’s Columns–Using a DataFrame’s Columns merging data with, 
Database-style DataFrame Merges–Database-style DataFrame Merges dates and times, Index Objects, Reading and Writing Data in Text Format, Date and Time Data Types and Tools, Date and Time Data Types and Tools, Converting between string and datetime–Converting between string and datetime, Converting between string and datetime, Generating Date Ranges–Generating Date Ranges, Generating Date Ranges, Generating Date Ranges, Generating Date Ranges, Scalar Types, Dates and times–Dates and times, Dates and times (see also time series data) data types for, Date and Time Data Types and Tools, Dates and times–Dates and times date ranges, Generating Date Ranges–Generating Date Ranges datetime type, Converting between string and datetime–Converting between string and datetime, Scalar Types, Dates and times DatetimeIndex Index object, Index Objects dateutil package, Converting between string and datetime date_parser argument, Reading and Writing Data in Text Format date_range function, Generating Date Ranges, Generating Date Ranges, Generating Date Ranges dayfirst argument, Reading and Writing Data in Text Format debug function, Other ways to make use of the debugger, Other ways to make use of the debugger debugger, IPython, Interactive Debugger–Other ways to make use of the debugger in IPython, Interactive Debugger–Other ways to make use of the debugger def keyword, Functions defaults, Profiles and Configuration, Default values–Default values profiles, Profiles and Configuration values for dicts, Default values–Default values del keyword, Input and Output Variables, DataFrame, Dict delete method, Index Objects delimited formats, Manually Working with Delimited Formats–Manually Working with Delimited Formats density plots, Histograms and Density Plots–Histograms and Density Plots describe method, Summarizing and Computing Descriptive Statistics, Summarizing and Computing Descriptive Statistics, Plotting Maps: Visualizing Haiti Earthquake Crisis Data, Apply: General 
split-apply-combine design tips, Code Design Tips–Overcome a fear of longer files, Keep relevant objects and data alive–Keep relevant objects and data alive, Flat is better than nested–Flat is better than nested, Overcome a fear of longer files–Overcome a fear of longer files flat is better than nested, Flat is better than nested–Flat is better than nested keeping relevant objects and data alive, Keep relevant objects and data alive–Keep relevant objects and data alive overcoming fear of longer files, Overcome a fear of longer files–Overcome a fear of longer files det function, Linear Algebra development tools in IPython, Software Development Tools–Profiling a Function Line-by-Line, Interactive Debugger–Other ways to make use of the debugger, Timing Code: %time and %timeit–Timing Code: %time and %timeit, Basic Profiling: %prun and %run -p–Basic Profiling: %prun and %run -p, Profiling a Function Line-by-Line–Profiling a Function Line-by-Line debugger, Interactive Debugger–Other ways to make use of the debugger profiling code, Basic Profiling: %prun and %run -p–Basic Profiling: %prun and %run -p profiling function line-by-line, Profiling a Function Line-by-Line–Profiling a Function Line-by-Line timing code, Timing Code: %time and %timeit–Timing Code: %time and %timeit diag function, Linear Algebra dicts, Interacting with the Operating System, Grouping with Dicts and Series–Grouping with Dicts and Series, Dict–Valid dict key types, Creating dicts from sequences–Creating dicts from sequences, Default values–Default values, Valid dict key types–Valid dict key types, List, Set, and Dict Comprehensions–Nested list comprehensions creating, Creating dicts from sequences–Creating dicts from sequences default values for, Default values–Default values dict comprehensions, List, Set, and Dict Comprehensions–Nested list comprehensions grouping on, Grouping with Dicts and Series–Grouping with Dicts and Series keys for, Valid dict key types–Valid dict key types returning system 
environment variables as, Interacting with the Operating System diff method, Index Objects, Summarizing and Computing Descriptive Statistics difference method, Set digitize function, numpy.searchsorted: Finding elements in a Sorted Array directories, Interacting with the Operating System, Directory Bookmark System–Directory Bookmark System bookmarking in IPython, Directory Bookmark System–Directory Bookmark System changing, commands for, Interacting with the Operating System discretization, Discretization and Binning–Discretization and Binning div method, Arithmetic methods with fill values divide function, Universal Functions: Fast Element-wise Array Functions .dmg file, Apple OS X donation statistics, Donation Statistics by Occupation and Employer–Donation Statistics by Occupation and Employer, Donation Statistics by State–Donation Statistics by State by occupation and employer, Donation Statistics by Occupation and Employer–Donation Statistics by Occupation and Employer by state, Donation Statistics by State–Donation Statistics by State dot function, Linear Algebra, Linear Algebra, NumPy Matrix Class doublequote option, Manually Working with Delimited Formats downsampling, Resampling and Frequency Conversion dpi (dots-per-inch) option, Saving Plots to File, Saving Plots to File dreload function, Reloading Module Dependencies drop method, Index Objects, Dropping entries from an axis–Dropping entries from an axis, Dropping entries from an axis dropna method, Handling Missing Data drop_duplicates method, Removing Duplicates dsplit function, Concatenating and Splitting Arrays dstack function, Concatenating and Splitting Arrays dtype object, What Is This Book About? 
(see data types) “duck” typing in Python, “Duck” typing–“Duck” typing dummy variables, Computing Indicator/Dummy Variables–Computing Indicator/Dummy Variables dumps function, JSON Data duplicated method, Removing Duplicates, Removing Duplicates duplicates, Removing Duplicates–Removing Duplicates, Time Series with Duplicate Indices–Time Series with Duplicate Indices indices, Time Series with Duplicate Indices–Time Series with Duplicate Indices removing from data, Removing Duplicates–Removing Duplicates dynamically-generated functions, Closures: Functions that Return Functions E edgecolor option, Saving Plots to File edit-compile-run workflow, IPython: An Interactive Computing and Development Environment eig function, Linear Algebra elif blocks, What Is This Book About?

Python 3, Python 2 and Python 3–Python 2 and Python 3 required libraries, Essential Python Libraries–SciPy, NumPy–NumPy, pandas–pandas, matplotlib–matplotlib, IPython–IPython, SciPy–SciPy IPython, IPython–IPython matplotlib, matplotlib–matplotlib NumPy, NumPy–NumPy pandas, pandas–pandas SciPy, SciPy–SciPy semantics of, Language Semantics–Mutable and immutable objects, Indentation, not braces–Indentation, not braces, Everything is an object–Everything is an object, Comments–Comments, Function and object method calls–Function and object method calls, Function and object method calls–Function and object method calls, Variables and pass-by-reference–Variables and pass-by-reference, Variables and pass-by-reference–Variables and pass-by-reference, Dynamic references, strong types–Dynamic references, strong types, Attributes and methods–Attributes and methods, “Duck” typing–“Duck” typing, Imports–Imports, Binary operators and comparisons–Binary operators and comparisons, Strictness versus laziness–Strictness versus laziness, Mutable and immutable objects–Mutable and immutable objects attributes in, Attributes and methods–Attributes and methods comments in, Comments–Comments functions in, Function and object method calls–Function and object method calls import directive, Imports–Imports indentation, Indentation, not braces–Indentation, not braces methods in, Function and object method calls–Function and object method calls mutable objects in, Mutable and immutable objects–Mutable and immutable objects object model, Everything is an object–Everything is an object operators for, Binary operators and comparisons–Binary operators and comparisons references in, Variables and pass-by-reference–Variables and pass-by-reference strict evaluation, Strictness versus laziness–Strictness versus laziness strongly-typed language, Dynamic references, strong types–Dynamic references, strong types variables in, Variables and pass-by-reference–Variables and pass-by-reference “duck” typing, 
“Duck” typing–“Duck” typing sequence functions in, Built-in Sequence Functions–reversed, enumerate–enumerate, sorted–sorted, zip–zip, reversed–reversed enumerate function, enumerate–enumerate reversed function, reversed–reversed sorted function, sorted–sorted zip function, zip–zip set comprehensions in, List, Set, and Dict Comprehensions–Nested list comprehensions sets in, Set–Set setting up, Installation and Setup–Integrated Development Environments (IDEs), Windows–Windows, Apple OS X–Apple OS X, GNU/Linux–GNU/Linux on Linux, GNU/Linux–GNU/Linux on OS X, Apple OS X–Apple OS X on Windows, Windows–Windows tuples in, Tuple–Tuple methods, Unpacking tuples–Unpacking tuples, Tuple methods–Tuple methods methods for, Tuple methods–Tuple methods unpacking, Unpacking tuples–Unpacking tuples pytz library, Time Zone Handling, Time Zone Handling, Time Zone Handling Q qcut method, Discretization and Binning, Discretization and Binning, Discretization and Binning, Quantile and Bucket Analysis, Quantile and Bucket Analysis, Decile and Quartile Analysis qr function, Linear Algebra Qt console for IPython, Qt-based Rich GUI Console–Qt-based Rich GUI Console quantile analysis, Quantile and Bucket Analysis–Quantile and Bucket Analysis quarterly periods, Quarterly Period Frequencies–Quarterly Period Frequencies quartile analysis, Decile and Quartile Analysis–Decile and Quartile Analysis question mark (?)


pages: 936 words: 85,745

Programming Ruby 1.9: The Pragmatic Programmer's Guide by Dave Thomas, Chad Fowler, Andy Hunt

book scanning, David Heinemeier Hansson, Debian, domain-specific language, duck typing, Jacquard loom, Kickstarter, Neal Stephenson, off-by-one error, p-value, revision control, Ruby on Rails, slashdot, sorting algorithm, web application

We didn’t have to use a string; for the object we’re testing here, an array would work just as well:

    require 'test/unit'
    require 'addcust'

    class TestAddCustomer < Test::Unit::TestCase
      def test_add
        c = Customer.new("Ima", "Customer")
        f = []
        c.append_name_to_file(f)
        assert_equal(["Ima", " ", "Customer"], f)
      end
    end

produces:

    Finished in 0.000405 seconds.
    1 tests, 1 assertions, 0 failures, 0 errors, 0 skips

Indeed, this form may be more convenient if we wanted to check that the correct individual things were inserted. So, duck typing is convenient for testing, but what about in the body of applications themselves? Well, it turns out that the same thing that made the tests easy in the previous example also makes it easy to write flexible application code. In fact, Dave had an interesting experience where duck typing dug him (and a client) out of a hole. He’d written a large Ruby-based web application that (among other things) kept a database table full of details of participants in a competition.
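The array-based test in this excerpt works only if the method under test sends nothing but `<<` to its argument. The excerpt does not show addcust.rb, so the Customer body below is an assumption made for illustration; it is a minimal sketch of how one method can serve a file, a string, or an array.

```ruby
# Hypothetical stand-in for the Customer class; the excerpt does not show
# its source, so this body is an assumption.
class Customer
  def initialize(first_name, last_name)
    @first_name = first_name
    @last_name  = last_name
  end

  # Relies only on the argument responding to <<, not on its class.
  def append_name_to_file(file)
    file << @first_name << " " << @last_name
  end
end

customer = Customer.new("Ima", "Customer")

as_array = []
customer.append_name_to_file(as_array)

as_string = String.new
customer.append_name_to_file(as_string)

p as_array   # the individual pieces, easy to assert on one by one
p as_string  # the concatenated text a file would have received
```

Passing an array keeps each appended piece separate, which is why the test can check the exact sequence of writes rather than only the final text.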

This meant that the intermediate lines were still referenced and hence were no longer garbage. It also meant that we were no longer building an ever-growing string that forced garbage collection. Thanks to duck typing, the change was trivial:

    def csv_from_row(op, row)
      # as before
    end

    result = []
    query.each_row {|row| csv_from_row(result, row)}
    http.write result.join

All that changed is that we passed an array into the csv_from_row method. Because it (implicitly) used duck typing, the method itself was not modified; it continued to append the data it generated to its parameter, not caring what type that parameter was.
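The switch from a string to an array described above can be sketched as follows. The body of csv_from_row is elided in the excerpt, so the one-line body here is an assumption, and the sample rows are made up; the point is only that the same method fills either kind of accumulator.

```ruby
# Hypothetical body: the excerpt elides it ("# as before"), so this is an
# assumption that only appends to op with <<.
def csv_from_row(op, row)
  op << row.join(",") << "\n"
end

rows = [[1, "ann"], [2, "bob"]]  # made-up sample data

# Accumulating into one growing string.
as_string = String.new
rows.each { |row| csv_from_row(as_string, row) }

# Accumulating into an array of small pieces, joined once at the end.
as_array = []
rows.each { |row| csv_from_row(as_array, row) }

puts as_array.join == as_string
```

Both accumulators respond to `<<` and return themselves from it, so the chained appends behave the same way for either one.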

But let’s go wild and implement addition for Roman numbers:

    class Roman
      MAX_ROMAN = 4999
      attr_reader :value
      protected :value

      def initialize(value)
        if value <= 0 || value > MAX_ROMAN
          fail "Roman values must be > 0 and <= #{MAX_ROMAN}"
        end
        @value = value
      end

      def coerce(other)
        if Integer === other
          [ other, @value ]
        else
          [ Float(other), Float(@value) ]
        end
      end

      def +(other)
        if Roman === other
          other = other.value
        end
        if Fixnum === other && (other + @value) < MAX_ROMAN
          Roman.new(@value + other)
        else
          x, y = other.coerce(@value)
          x + y
        end
      end

      FACTORS = [["m", 1000], ["cm", 900], ["d", 500], ["cd", 400],
                 ["c", 100], ["xc", 90], ["l", 50], ["xl", 40],
                 ["x", 10], ["ix", 9], ["v", 5], ["iv", 4],
                 ["i", 1]]

      def to_s
        value = @value
        roman = ""
        for code, factor in FACTORS
          count, value = value.divmod(factor)
          roman << (code * count)
        end
        roman
      end
    end

    iv = Roman.new(4)
    xi = Roman.new(11)

    iv + 3          # => vii
    iv + 3 + 4      # => xi
    iv + 3.14159    # => 7.14159
    xi + 4900       # => mmmmcmxi
    xi + 4990       # => 5001

Finally, be careful with coerce: try always to coerce into a more general type, or you may end up generating coercion loops. This is a situation where A tries to coerce to B, and B tries to coerce back to A.

Walk the Walk, Talk the Talk

Duck typing can generate controversy. Every now and then a thread flares on the mailing lists or someone blogs for or against the concept. Many of the contributors to these discussions have some fairly extreme positions. Ultimately, though, duck typing isn’t a set of rules; it’s just a style of programming. Design your programs to balance paranoia and flexibility. If you feel the need to constrain the types of objects that the users of a method pass in, ask yourself why.
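The advice to coerce toward a more general type can be illustrated with a much smaller class than Roman. The Meters class below is hypothetical and not from the book; it is a sketch, assuming that converting both operands to Float is an acceptable "more general" choice.

```ruby
# Hypothetical wrapper class, used only to illustrate the coerce protocol.
class Meters
  attr_reader :value

  def initialize(value)
    @value = value
  end

  # Ruby's built-in numerics call coerce on an operand they do not
  # recognize, passing themselves in. Returning two Floats moves both
  # sides to a more general type, so no further coercion is needed.
  def coerce(other)
    [Float(other), Float(@value)]
  end

  def +(other)
    other = other.value if other.is_a?(Meters)
    Meters.new(@value + other)
  end
end

p((Meters.new(4) + 3).value)  # Meters#+ handles the Integer directly
p(3 + Meters.new(4))          # Integer#+ falls back to Meters#coerce
```

Because coerce hands back plain Floats, the second addition is finished by Float#+ and never calls back into Meters, which is one way to avoid the coercion loop the excerpt warns about.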


Exploring Everyday Things with R and Ruby by Sau Sheong Chang

Alfred Russel Wallace, bioinformatics, business process, butterfly effect, cloud computing, Craig Reynolds: boids flock, data science, Debian, duck typing, Edward Lorenz: Chaos theory, Gini coefficient, income inequality, invisible hand, p-value, price stability, Ruby on Rails, Skype, statistical model, stem cell, Stephen Hawking, text mining, The Wealth of Nations by Adam Smith, We are the 99%, web application, wikimedia commons

For example, in Java, you must declare a variable's type before you assign it a value:

    int count = 100;

However, in Ruby, you only need to do this:

    count = 100

You are expected to use the variable properly; that is, if you placed an integer into the variable, you’re expected to use it as an integer in your code. Ruby knows that the object count refers to is an integer, but it does not check in advance how you use it, and it does not silently convert the object to another type: an operation succeeds if the object supports it and raises an error at run time if it does not. Relying on what an object can do, rather than on its class, is known as duck typing. The idea behind duck typing comes from the duck test: “if it walks like a duck, and quacks like a duck, then it is a duck.” What this means is that the type of the object is not determined by the class of the object. Instead, the type depends on what the object can do. A simple example goes like this. Let’s say we define a method named op:

    def op(a,b)
      a << b
    end

The method takes two parameters and returns a single value.
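The duck test described above can be sketched with two unrelated classes. Duck, Robot, and make_it_quack are hypothetical names introduced only for this illustration.

```ruby
# Hypothetical classes that share no ancestor other than Object.
class Duck
  def quack
    "Quack!"
  end
end

class Robot
  def quack
    "Beep-quack."
  end
end

# Accepts anything that responds to quack; the class is never inspected.
def make_it_quack(thing)
  thing.quack
end

[Duck.new, Robot.new].each { |thing| puts make_it_quack(thing) }

# An object that cannot quack fails only when the call is actually made.
begin
  make_it_quack(42)
rescue NoMethodError => e
  puts "Not a duck: #{e.class}"
end
```

Nothing is converted along the way: the integer is rejected at run time because it has no quack method, not because of its class.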

No problem here:

    x = 'hello '
    y = 'world'
    op(x,y)
    => 'hello world'

If x is an array and y is a string, the method appends y to x, returning an array:

    x = ['hello']
    y = 'world'
    op(x,y)
    => ["hello", "world"]

If x and y are integers, the method will perform a left-shift bitwise operation, moving binary 1 two positions to the left, resulting in 4:

    x = 1
    y = 2
    op(x,y)
    => 4

So what does this mean? There are both benefits and drawbacks to duck typing. The most obvious drawback is that we have a method that is inconsistent: if we put different values into the method, we can get wildly different results, and this is not checked anytime before the actual running of the program. However, the major benefit of this approach is that it results in much simpler code. If you know what you’re doing, it can lead to code that is easier to read and to maintain. Ultimately, duck typing is more of a philosophy than a fixed way of coding in Ruby. If you want to ensure that the op method you defined can be used only for strings, for example, you can always do this: def op(a,b) throw "Input parameters to op must be string" unless a.is_a?
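The three behaviours of op, and one way to restrict it to strings, can be sketched as below. The guarded variant is an assumption, not the book's code (the excerpt's version is cut off): strict_op is a hypothetical name, and it uses raise with ArgumentError, the usual Ruby idiom for signalling a bad argument.

```ruby
def op(a, b)
  a << b
end

p op(String.new("hello "), "world")  # String#<< appends text
p op(["hello"], "world")             # Array#<< appends an element
p op(1, 2)                           # Integer#<< shifts bits left

# Hypothetical guarded variant; not the book's code.
def strict_op(a, b)
  unless a.is_a?(String) && b.is_a?(String)
    raise ArgumentError, "Input parameters to op must be strings"
  end
  a << b
end

begin
  strict_op(1, 2)
rescue ArgumentError => e
  puts e.message
end
```

The guard trades flexibility for predictability: strict_op gives up the array and integer cases in exchange for an early, explicit error.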

: (question mark, colon), in Ruby ternary conditional expression, if and unless > (right angle bracket), The R Console, Variables and Functions -> assignment operator, R, Variables and Functions > R console prompt, The R Console ' ' (single quotes), enclosing Ruby strings, Strings [ ] (square brackets), Vectors, Matrices, Data frames accessing subset of R data frame, Data frames enclosing R matrix indexes, Matrices enclosing R vector indexes, Vectors [[ ]] (square brackets, double), enclosing single R vector index, Vectors A aes() function, R, Aesthetics An Inquiry into the Nature and Causes of the Wealth of Nations (University of Chicago Press), The Invisible Hand apply() function, R, Interpreting the Data Armchair Economist (Free Press), How to Be an Armchair Economist array() function, R, Arrays arrays, R, Arrays–Arrays arrays, Ruby, Arrays and hashes–Arrays and hashes, Arrays and hashes artificial society, Money (see Utopia example) as.Date() function, R, Number of Messages by Day of the Month ascultation, Auscultation assignment operators, R, Variables and Functions at sign, double (@@), preceding Ruby class variables, Class methods and variables attr keyword, Ruby, Classes and objects Audacity audio editor, Homemade Digital Stethoscope average, Interpreting the Data (see mean() function, R) Axtell, Robert (researcher), It’s a Good Life Growing Artificial Societies: Social Science from the Bottom Up (Brookings Institution Press/MIT Press), It’s a Good Life B backticks (` `), enclosing R operators as functions, Variables and Functions bar charts, Plotting charts, Interpreting the Data–Interpreting the Data, The Second Simulation–The Second Simulation, The Third Simulation–The Third Simulation, The Final Simulation–The Final Simulation barplot() function, R, Plotting charts batch mode, R, Sourcing Files and the Command Line Bioconductor repository, Packages birds flocking, Schooling Fish and Flocking Birds (see flocking example) bmp() function, R, Basic Graphs 
Boids algorithm, Schooling Fish and Flocking Birds–The Origin of Boids Box, George Edward Pelham (statistician), regarding usefulness of models, The Simple Scenario break keyword, R, Conditionals and Loops brew command, Installing Ruby using your platform’s package management tool butterfly effect, The Changes C c() function, R, Vectors CALO Project, The Emailing Habits of Enron Executives camera, pulse oximeter using, Homemade Pulse Oximeter case expression, Ruby, case expression chaos theory, The Changes charts, Charting–Adjustments, Plotting charts, Statistical transformation, Geometric object, Interpreting the Data–Interpreting the Data, Interpreting the Data–Interpreting the Data, Interpreting the Data–Interpreting the Data, The Second Simulation, The Second Simulation–The Second Simulation, The Third Simulation–The Third Simulation, The Third Simulation–The Third Simulation, The Final Simulation–The Final Simulation, The Final Simulation–The Final Simulation, Analyzing the Simulation–Analyzing the Simulation, Analyzing the Second Simulation–Analyzing the Second Simulation, Number of Messages by Day of the Month–Number of Messages by Hour of the Day, Generating the Heart Sounds Waveform–Generating the Heart Sounds Waveform, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform and Calculating the Heart Rate, Money–Money, Money–Money, Implementation bar charts, Plotting charts, Interpreting the Data–Interpreting the Data, The Second Simulation–The Second Simulation, The Third Simulation–The Third Simulation, The Final Simulation–The Final Simulation histograms, Statistical transformation, Geometric object, Money–Money line charts, Interpreting the Data–Interpreting the Data, Analyzing the Simulation–Analyzing the Simulation, Analyzing the Second Simulation–Analyzing the Second Simulation Lorenz curves, Money–Money scatterplots, Interpreting the Data–Interpreting the Data, The Second Simulation, The Third 
Simulation–The Third Simulation, The Final Simulation–The Final Simulation, Number of Messages by Day of the Month–Number of Messages by Hour of the Day, Implementation waveforms, Generating the Heart Sounds Waveform–Generating the Heart Sounds Waveform, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform and Calculating the Heart Rate class methods, Ruby, Class methods and variables class variables, Ruby, Class methods and variables–Class methods and variables classes, R, Programming R classes, Ruby, Classes and objects–Classes and objects code examples, Using Code Examples (see example applications) colon (:), Symbols, Vectors creating R vectors, Vectors preceding Ruby symbols, Symbols comma-separated value (CSV) files, Importing data from text files (see CSV files) Comprehensive R Archive Network (CRAN), Packages conditionals, R, Conditionals and Loops conditionals, Ruby, Conditionals and loops–case expression contact information for this book, How to Contact Us conventions used in this book, Conventions Used in This Book cor() function, R, The R Console Core library, Ruby, Requiring External Libraries corpus, Text Mining correlation, R, The R Console CRAN (Comprehensive R Archive Network), Packages CSV (comma-separated value) files, Importing data from text files, The First Simulation–The First Simulation, The First Simulation, Interpreting the Data, The Simulation, Extracting Data from Sound–Extracting Data from Sound, Extracting Data from Video extracting video data to, Extracting Data from Video extracting WAV data to, Extracting Data from Sound–Extracting Data from Sound reading data from, Interpreting the Data writing data to, The First Simulation–The First Simulation, The Simulation csv library, Ruby, The First Simulation, The Simulation, Grab and Parse curl utility, Ruby Version Manager (RVM) D data, Data, Data, Everywhere–Data, Data, Everywhere, Bringing the World to Us, Importing Data–Importing data from a 
database, Importing data from text files, The First Simulation–The First Simulation, Interpreting the Data, How to Be an Armchair Economist, The Simulation, Grab and Parse–Grab and Parse, The Emailing Habits of Enron Executives–The Emailing Habits of Enron Executives, Homemade Digital Stethoscope–Extracting Data from Sound, Extracting Data from Sound–Extracting Data from Sound, Homemade Pulse Oximeter–Extracting Data from Video, Extracting Data from Video analyzing, Data, Data, Everywhere–Data, Data, Everywhere, Bringing the World to Us, How to Be an Armchair Economist charts for, How to Be an Armchair Economist (see charts) obstacles to, Data, Data, Everywhere–Data, Data, Everywhere simulations for, Bringing the World to Us (see simulations) audio, from stethoscope, Homemade Digital Stethoscope–Extracting Data from Sound CSV files for, Importing data from text files, The First Simulation–The First Simulation, Interpreting the Data, The Simulation, Extracting Data from Sound–Extracting Data from Sound, Extracting Data from Video from Enron, The Emailing Habits of Enron Executives–The Emailing Habits of Enron Executives from Gmail, Grab and Parse–Grab and Parse importing, R, Importing Data–Importing data from a database video, from pulse oximeter, Homemade Pulse Oximeter–Extracting Data from Video data frames, R, Data frames–Data frames data mining, The Idea data.frame() function, R, Data frames database, importing data from, Importing data from a database–Importing data from a database dbConnect() function, R, Importing data from a database dbGet() function, R, Importing data from a database DBI packages, R, Importing data from a database–Importing data from a database Debian system, installing Ruby on, Installing Ruby using your platform’s package management tool def keyword, Ruby, Classes and objects dimnames() function, R, Matrices distribution, normal, Money dollar sign ($), preceding R list item names, Lists doodling example, Shoes doodler–Shoes doodler double 
quotes (" "), enclosing Ruby strings, Strings duck typing, Ruby, Code like a duck–Code like a duck dynamic typing, Ruby, Code like a duck–Code like a duck E economics example, A Simple Market Economy–A Simple Market Economy, The Producer–The Producer, The Consumer–The Consumer, Some Convenience Methods–Some Convenience Methods, The Simulation–The Simulation, Analyzing the Simulation–Analyzing the Simulation, The Producer–The Producer, The Consumer–The Consumer, Market–Market, The Simulation–The Simulation, Analyzing the Second Simulation–Analyzing the Second Simulation, Price Controls–Price Controls charts for, Analyzing the Simulation–Analyzing the Simulation, Analyzing the Second Simulation–Analyzing the Second Simulation Consumer class for, The Consumer–The Consumer, The Consumer–The Consumer Market class for, Some Convenience Methods–Some Convenience Methods, Market–Market modeling, A Simple Market Economy–A Simple Market Economy price controls analysis, Price Controls–Price Controls Producer class for, The Producer–The Producer, The Producer–The Producer simulations for, The Simulation–The Simulation, The Simulation–The Simulation email example, Grab and Parse–Grab and Parse, The Emailing Habits of Enron Executives–The Emailing Habits of Enron Executives, Number of Messages by Day of the Month–Number of Messages by Day of the Month, Number of Messages by Day of the Month–Number of Messages by Hour of the Day, MailMiner–MailMiner, Number of Messages by Day of Week–Number of Messages by Hour of the Day, Interactions–Comparative Interactions, Text Mining–Text Mining charts for, Number of Messages by Day of the Month–Number of Messages by Hour of the Day content of messages, analyzing, Text Mining–Text Mining data for, Grab and Parse–Grab and Parse Enron data for, The Emailing Habits of Enron Executives–The Emailing Habits of Enron Executives interactions in email, analyzing, Interactions–Comparative Interactions number of messages, analyzing, Number of Messages 
by Day of the Month–Number of Messages by Day of the Month, Number of Messages by Day of Week–Number of Messages by Hour of the Day R package for, creating, MailMiner–MailMiner emergent behavior, The Origin of Boids (see also flocking example) Enron Corporation scandal, The Emailing Habits of Enron Executives Epstein, Joshua (researcher), It’s a Good Life Growing Artificial Societies: Social Science from the Bottom Up (Brookings Institution Press/MIT Press), It’s a Good Life equal sign (=), assignment operator, R, Variables and Functions Euclidean distance, Roids evolution, Evolution example applications, Using Code Examples, Shoes stopwatch–Shoes stopwatch, Shoes doodler–Shoes doodler, The R Console–Sourcing Files and the Command Line, Data frames–Introducing ggplot2, qplot–qplot, Statistical transformation–Geometric object, Adjustments–Adjustments, Offices and Restrooms, A Simple Market Economy, Grab and Parse, My Beating Heart, Schooling Fish and Flocking Birds, Money artificial utopian society, Money (see Utopia example) birds flocking, Schooling Fish and Flocking Birds (see flocking example) doodling, Shoes doodler–Shoes doodler economics, A Simple Market Economy (see economics example) email, Grab and Parse (see email example) fuel economy, qplot–qplot, Adjustments–Adjustments heartbeat, My Beating Heart (see heartbeat example) height and weight, The R Console–Sourcing Files and the Command Line league table, Data frames–Introducing ggplot2 movie database, Statistical transformation–Geometric object permission to use, Using Code Examples restrooms, Offices and Restrooms (see restrooms example) stopwatch, Shoes stopwatch–Shoes stopwatch expressions, R, Programming R external libraries, Ruby, Requiring External Libraries–Requiring External Libraries F factor() function, R, Factors, Text Mining factors, R, Factors–Factors FFmpeg library, Extracting Data from Video, Extracting Data from Video field of vision (FOV), Roids fish, schools of, Schooling Fish and 
Flocking Birds (see flocking example) flocking example, Schooling Fish and Flocking Birds–The Origin of Boids, The Origin of Boids, Simulation–Simulation, Roids–Roids, The Boid Flocking Rules–Putting in Obstacles, The Boid Flocking Rules–The Boid Flocking Rules, A Variation on the Rules–A Variation on the Rules, Going Round and Round–Going Round and Round, Putting in Obstacles–Putting in Obstacles Boids algorithm for, Schooling Fish and Flocking Birds–The Origin of Boids centering path for, Going Round and Round–Going Round and Round obstacles in path for, Putting in Obstacles–Putting in Obstacles research regarding, A Variation on the Rules–A Variation on the Rules Roid class for, Roids–Roids rules for, The Origin of Boids, The Boid Flocking Rules–The Boid Flocking Rules simulations for, Simulation–Simulation, The Boid Flocking Rules–Putting in Obstacles flows, Shoes, Shoes stopwatch fonts used in this book, Conventions Used in This Book–Conventions Used in This Book for loop, R, Conditionals and Loops format() function, R, Number of Messages by Day of the Month FOV (field of vision), Roids fuel economy example, qplot–qplot, Adjustments–Adjustments function class, R, Programming R functions, R, Variables and Functions–Variables and Functions G GAM (generalized addictive model), The Changes gem command, Ruby, Requiring External Libraries .gem file extension, Requiring External Libraries generalized addictive model (GAM), The Changes Gentleman, Robert (creator of R), Introducing R geom_bar() function, R, Interpreting the Data, The Second Simulation, The Final Simulation geom_histogram() function, R, Geometric object geom_line() function, R, Analyzing the Simulation geom_point() function, R, Plot, Interpreting the Data, Generating the Heart Sounds Waveform geom_smooth() function, R, Interpreting the Data ggplot() function, R, Plot ggplot2 package, R, Introducing ggplot2–Adjustments Gini coefficient, Money Git utility, Ruby Version Manager (RVM) Gmail, retrieving 
message data from, Grab and Parse–Grab and Parse graphics device, opening, Basic Graphs graphics package, R, Basic Graphs graphs, Charting (see charts) Growing Artificial Societies: Social Science from the Bottom Up (Brookings Institution Press/MIT Press), It’s a Good Life H hash mark, curly brackets (#{ }), enclosing Ruby string escape sequences, Strings hashes, Ruby, Arrays and hashes–Arrays and hashes heart, diagram of, Generating the Heart Sounds Waveform heartbeat example, My Beating Heart, My Beating Heart, My Beating Heart, Homemade Digital Stethoscope, Homemade Digital Stethoscope, Homemade Digital Stethoscope–Extracting Data from Sound, Generating the Heart Sounds Waveform–Generating the Heart Sounds Waveform, Generating the Heart Sounds Waveform, Finding the Heart Rate–Finding the Heart Rate, Homemade Pulse Oximeter–Homemade Pulse Oximeter, Homemade Pulse Oximeter–Extracting Data from Video, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform and Calculating the Heart Rate, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform and Calculating the Heart Rate charts for, Generating the Heart Sounds Waveform–Generating the Heart Sounds Waveform, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform and Calculating the Heart Rate data for, Homemade Digital Stethoscope–Extracting Data from Sound, Homemade Pulse Oximeter–Extracting Data from Video audio from stethoscope, Homemade Digital Stethoscope–Extracting Data from Sound video from pulse oximeter, Homemade Pulse Oximeter–Extracting Data from Video heart rate, My Beating Heart, Finding the Heart Rate–Finding the Heart Rate, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform and Calculating the Heart Rate finding from video file, Generating the Heartbeat Waveform and Calculating the Heart Rate–Generating the Heartbeat Waveform 
and Calculating the Heart Rate finding from WAV file, Finding the Heart Rate–Finding the Heart Rate health parameters for, My Beating Heart heart sounds, My Beating Heart, My Beating Heart, Homemade Digital Stethoscope, Generating the Heart Sounds Waveform health parameters for, My Beating Heart recording, Homemade Digital Stethoscope types of, My Beating Heart, Generating the Heart Sounds Waveform homemade pulse oximeter for, Homemade Pulse Oximeter–Homemade Pulse Oximeter homemade stethoscope for, Homemade Digital Stethoscope height and weight example, The R Console–Sourcing Files and the Command Line here-documents, Ruby, Strings hex editor, Extracting Data from Sound histograms, Statistical transformation, Geometric object, Money–Money Homebrew tool, Installing Ruby using your platform’s package management tool hyphen (-), Variables and Functions, Variables and Functions -> assignment operator, R, Variables and Functions <- assignment operator, R, Variables and Functions I icons used in this book, Conventions Used in This Book if expression, R, Conditionals and Loops if expression, Ruby, if and unless–if and unless Ihaka, Ross (creator of R), Introducing R ImageMagick library, Extracting Data from Video IMAP (Internet Message Access Protocol), Grab and Parse importing data, R, Importing Data–Importing data from a database inheritance, Ruby, Inheritance–Inheritance initialize method, Ruby, Classes and objects inner product, Roids–Roids installation, Installing Ruby–Installing Ruby using your platform’s package management tool, Installing Shoes–Installing Shoes, Introducing R, Installing packages–Installing packages R, Introducing R R packages, Installing packages–Installing packages Ruby, Installing Ruby–Installing Ruby using your platform’s package management tool Shoes, Installing Shoes–Installing Shoes Internet Message Access Protocol (IMAP), Grab and Parse Internet Message Format, The Emailing Habits of Enron Executives invisible hand metaphor, The Invisible 
Hand irb application, Running Ruby–Running Ruby J jittering, Adjustments jpeg() function, R, Basic Graphs L Landsburg, Stephen E.


pages: 1,829 words: 135,521

Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython by Wes McKinney

Bear Stearns, business process, data science, Debian, duck typing, Firefox, general-purpose programming language, Google Chrome, Guido van Rossum, index card, p-value, quantitative trading / quantitative finance, random walk, recommendation engine, sentiment analysis, side project, sorting algorithm, statistical model, Two Sigma, type inference

, The Python Interpreter interrupting running code, Interrupting running code intersect1d method, Unique and Other Set Logic intersection method, set-set, Index Objects intersection_update method, set intervals of time, Time Series inv function, Linear Algebra .ipynb file extension, Running the Jupyter Notebook IPython %run command and, The Python Interpreter %run command in, The %run Command-Interrupting running code about, IPython and Jupyter advanced features, Advanced IPython Features-Profiles and Configuration bookmarking directories, Directory Bookmark System code development tips, Tips for Productive Code Development Using IPython-Overcome a fear of longer files command history in, Using the Command History-Input and Output Variables exception handling in, Exceptions in IPython executing code from clipboard, Executing Code from the Clipboard figures and subplots, Figures and Subplots interacting with operating system, Interacting with the Operating System-Directory Bookmark System keyboard shortcuts for, Terminal Keyboard Shortcuts magic commands in, About Magic Commands-About Magic Commands matplotlib integration, Matplotlib Integration object introspection, Introspection-Introspection running Jupyter notebook, Running the Jupyter Notebook-Running the Jupyter Notebook running shell, Running the IPython Shell-Running the IPython Shell shell commands in, Shell Commands and Aliases software development tools, Software Development Tools-Profiling a Function Line by Line tab completion in, Tab Completion-Tab Completion ipython command, Running the IPython Shell-Running the IPython Shell is keyword, Binary operators and comparisons is not keyword, Binary operators and comparisons isalnum method, Vectorized String Functions in pandas isalpha method, Vectorized String Functions in pandas isdecimal method, Vectorized String Functions in pandas isdigit method, Vectorized String Functions in pandas isdisjoint method, set isfinite function, Universal Functions: Fast 
Element-Wise Array Functions isin method, Index Objects, Unique Values, Value Counts, and Membership isinf function, Universal Functions: Fast Element-Wise Array Functions isinstance function, Dynamic references, strong types islower method, Vectorized String Functions in pandas isnan function, Universal Functions: Fast Element-Wise Array Functions isnull method, Series, Handling Missing Data isnumeric method, Vectorized String Functions in pandas issubdtype function, NumPy dtype Hierarchy issubset method, set issuperset method, set isupper method, Vectorized String Functions in pandas is_monotonic property, Index Objects is_unique property, Index Objects, Axis Indexes with Duplicate Labels, Time Series with Duplicate Indices iter function, Duck typing __iter__ magic method, Duck typing iterator protocol, Duck typing, Generators-itertools module itertools module, itertools module J jit function, Writing Fast NumPy Functions with Numba join method, String Object Methods-String Object Methods, Vectorized String Functions in pandas, Merging on Index join operations, Database-Style DataFrame Joins-Database-Style DataFrame Joins JSON (JavaScript Object Notation), JSON Data-JSON Data, 1.USA.gov Data from Bitly json method, Interacting with Web APIs Jupyter notebook%load magic function, The %run Command about, IPython and Jupyter plotting nuances, Figures and Subplots running, Running the Jupyter Notebook-Running the Jupyter Notebook jupyter notebook command, Running the Jupyter Notebook K KDE (kernel density estimate) plots, Histograms and Density Plots kernels, defined, IPython and Jupyter, Running the Jupyter Notebook key-value pairs, dict keyboard shortcuts for IPython, Terminal Keyboard Shortcuts KeyboardInterrupt exception, Interrupting running code KeyError exception, set keys method, dict keyword arguments, Function and object method calls, Functions kurt method, Summarizing and Computing Descriptive Statistics L l(ist) debugger command, Interactive Debugger 
labelsaxis indexes with duplicate labels, Axis Indexes with Duplicate Labels selecting in matplotlib, Ticks, Labels, and Legends-Setting the title, axis labels, ticks, and ticklabels lagging data, Shifting (Leading and Lagging) Data lambda (anonymous) functions, Anonymous (Lambda) Functions language semantics for Pythonabout, Language Semantics attributes, Attributes and methods binary operators and comparisons, Binary operators and comparisons, set comments, Comments duck typing, Duck typing function and object method calls, Function and object method calls import conventions, Imports indentation not braces, Indentation, not braces methods, Attributes and methods mutable and immutable objects, Mutable and immutable objects object model, Everything is an object references, Variables and argument passing-Dynamic references, strong types strongly typed language, Dynamic references, strong types variables and argument passing, Variables and argument passing last method, Data Aggregation leading data, Shifting (Leading and Lagging) Data left join type, Database-Style DataFrame Joins legend method, Adding legends legend selection in matplotlib, Colors, Markers, and Line Styles-Adding legends len function, Grouping with Functions len method, Vectorized String Functions in pandas less function, Universal Functions: Fast Element-Wise Array Functions less_equal function, Universal Functions: Fast Element-Wise Array Functions level keyword, Grouping by Index Levels level method, Summarizing and Computing Descriptive Statistics levelsgrouping by index levels, Grouping by Index Levels sorting, Reordering and Sorting Levels summary statistics by, Summary Statistics by Level lexsort method, Indirect Sorts: argsort and lexsort libraries (see specific libraries) line plots, Line Plots-Line Plots line style selection in matplotlib, Colors, Markers, and Line Styles linear algebra, Linear Algebra-Linear Algebra linear regression, Example: Group-Wise Linear Regression, Estimating 
Linear Models-Estimating Linear Models Linux, setting up Python on, GNU/Linux list comprehensions, List, Set, and Dict Comprehensions-Nested list comprehensions list function, Binary operators and comparisons, List lists (data structures)about, List adding and removing elements, Adding and removing elements combining, Concatenating and combining lists concatenating, Concatenating and combining lists maintaining sorted lists, Binary search and maintaining a sorted list slicing, Slicing sorting, Sorting lists (data structures)binary searches, Binary search and maintaining a sorted list ljust method, String Object Methods load function, File Input and Output with Arrays, Advanced Array Input and Output %load magic function, The %run Command loads function, JSON Data loc operator, DataFrame, Selection with loc and iloc, Adding legends, Interfacing Between pandas and Model Code local namespace, Namespaces, Scope, and Local Functions, Getting Started with pandas localizing data to time zones, Time Zone Localization and Conversion log function, Universal Functions: Fast Element-Wise Array Functions log10 function, Universal Functions: Fast Element-Wise Array Functions log1p function, Universal Functions: Fast Element-Wise Array Functions log2 function, Universal Functions: Fast Element-Wise Array Functions logical_and function, Universal Functions: Fast Element-Wise Array Functions, ufunc Instance Methods logical_not function, Universal Functions: Fast Element-Wise Array Functions logical_or function, Universal Functions: Fast Element-Wise Array Functions logical_xor function, Universal Functions: Fast Element-Wise Array Functions LogisticRegression class, Introduction to scikit-learn LogisticRegressionCV class, Introduction to scikit-learn long format, Pivoting “Long” to “Wide” Format lower method, Transforming Data Using a Function or Mapping, String Object Methods, Vectorized String Functions in pandas %lprun magic function, Profiling a Function Line by Line lstrip 
method, String Object Methods, Vectorized String Functions in pandas lstsq function, Linear Algebra lxml library, XML and HTML: Web Scraping-Parsing XML with lxml.objectify M %m datetime format, Dates and times, Converting Between String and Datetime %M datetime format, Dates and times, Converting Between String and Datetime mad method, Summarizing and Computing Descriptive Statistics magic functions, About Magic Commands-About Magic Commands(see also specific magic functions) %debug magic function, About Magic Commands %magic magic function, About Magic Commands many-to-many merge, Database-Style DataFrame Joins many-to-one join, Database-Style DataFrame Joins map built-in function, List, Set, and Dict Comprehensions, Functions Are Objects map method, Function Application and Mapping, Transforming Data Using a Function or Mapping, Renaming Axis Indexes mappingtransforming data using, Transforming Data Using a Function or Mapping universal functions, Function Application and Mapping-Sorting and Ranking margins method, Pivot Tables and Cross-Tabulation margins, defined, Pivot Tables and Cross-Tabulation marker selection in matplotlib, Colors, Markers, and Line Styles match method, Unique Values, Value Counts, and Membership, Regular Expressions, Regular Expressions, Vectorized String Functions in pandas Math Kernel Library (MKL), Linear Algebra matplotlib libraryabout, matplotlib, Plotting and Visualization annotations in, Annotations and Drawing on a Subplot-Annotations and Drawing on a Subplot color selection in, Colors, Markers, and Line Styles configuring, matplotlib Configuration creating image plots, Array-Oriented Programming with Arrays figures in, Figures and Subplots-Adjusting the spacing around subplots import convention, A Brief matplotlib API Primer integration with IPython, Matplotlib Integration label selection in, Ticks, Labels, and Legends-Setting the title, axis labels, ticks, and ticklabels legend selection in, Colors, Markers, and Line 
Styles-Adding legends line style selection in, Colors, Markers, and Line Styles marker selection in, Colors, Markers, and Line Styles saving plots to files, Saving Plots to File subplots in, Figures and Subplots-Adjusting the spacing around subplots, Annotations and Drawing on a Subplot-Annotations and Drawing on a Subplot tick mark selection in, Ticks, Labels, and Legends-Setting the title, axis labels, ticks, and ticklabels %matplotlib magic function, Matplotlib Integration, Interacting with the Operating System matrix operations in NumPy, Transposing Arrays and Swapping Axes, Linear Algebra max method, Mathematical and Statistical Methods, Sorting and Ranking, Summarizing and Computing Descriptive Statistics, Data Aggregation maximum function, Universal Functions: Fast Element-Wise Array Functions mean method, Mathematical and Statistical Methods, Summarizing and Computing Descriptive Statistics, GroupBy Mechanics, Data Aggregation median method, Summarizing and Computing Descriptive Statistics, Data Aggregation melt method, Pivoting “Wide” to “Long” Format memmap object, Memory-Mapped Files memory managementC versus Fortran order, C Versus Fortran Order continguous memory, The Importance of Contiguous Memory-The Importance of Contiguous Memory NumPy-based algorithms and, NumPy Basics: Arrays and Vectorized Computation memory-mapped files, Memory-Mapped Files merge function, Database-Style DataFrame Joins-Database-Style DataFrame Joins mergesort method, Alternative Sort Algorithms merging datacombining data with overlap, Combining Data with Overlap concatenating along an axis, Concatenating Along an Axis-Concatenating Along an Axis database-stye DataFrame joins, Database-Style DataFrame Joins-Database-Style DataFrame Joins merging on index, Merging on Index-Merging on Index meshgrid function, Array-Oriented Programming with Arrays methodscategorical, Categorical Methods-Categorical Methods chaining, Techniques for Method Chaining-The pipe Method defined, 
Function and object method calls for boolean arrays, Methods for Boolean Arrays for strings, String Object Methods-String Object Methods for summary statistics, Unique Values, Value Counts, and Membership-Unique Values, Value Counts, and Membership for tuples, Tuple methods hidden, Tab Completion in Python, Function and object method calls, Attributes and methods object introspection, Introspection optimized for GroupBy, Data Aggregation statistical, Mathematical and Statistical Methods-Mathematical and Statistical Methods ufunc instance methods, ufunc Instance Methods-ufunc Instance Methods vectorized string methods in pandas, Vectorized String Functions in pandas-Vectorized String Functions in pandas Microsoft Excel files, Reading Microsoft Excel Files-Reading Microsoft Excel Files min method, Mathematical and Statistical Methods, Sorting and Ranking, Summarizing and Computing Descriptive Statistics, Data Aggregation minimum function, Universal Functions: Fast Element-Wise Array Functions missing dataabout, Handling Missing Data filling in, Filling In Missing Data-Filling In Missing Data, Replacing Values filling with group-specific values, Example: Filling Missing Values with Group-Specific Values filtering out, Filtering Out Missing Data marked by sentinel values, Reading and Writing Data in Text Format, Handling Missing Data sorting considerations, Sorting and Ranking mixture-of-normals estimate, Histograms and Density Plots MKL (Math Kernel Library), Linear Algebra mod function, Universal Functions: Fast Element-Wise Array Functions modf function, Universal Functions: Fast Element-Wise Array Functions-Universal Functions: Fast Element-Wise Array Functions modulesimport conventions for, Import Conventions, Imports reloading dependencies, Reloading Module Dependencies MovieLens 1M dataset example, MovieLens 1M Dataset-Measuring Rating Disagreement moving window functionsabout, Moving Window Functions-Moving Window Functions binary, Binary Moving Window 
Functions exponentially-weighted functions, Exponentially Weighted Functions user-defined, User-Defined Moving Window Functions mro method, NumPy dtype Hierarchy MSFT attribute, Correlation and Covariance mul method, Arithmetic methods with fill values multiply function, Universal Functions: Fast Element-Wise Array Functions munging (see data wrangling) mutable objects, Mutable and immutable objects N n(ext) debugger command, Interactive Debugger NA data type, Handling Missing Data name attribute, Series, DataFrame names attribute, Boolean Indexing, Structured and Record Arrays namespacesempty, The %run Command functions and, Namespaces, Scope, and Local Functions in Python, Dynamic references, strong types NumPy, The NumPy ndarray: A Multidimensional Array Object NaN (Not a Number), Universal Functions: Fast Element-Wise Array Functions, Series, Handling Missing Data NaT (Not a Time), Converting Between String and Datetime ndarray objectabout, NumPy Basics: Arrays and Vectorized Computation, The NumPy ndarray: A Multidimensional Array Object-The NumPy ndarray: A Multidimensional Array Object advanced input and output, Advanced Array Input and Output-HDF5 and Other Array Storage Options arithmetic with, Arithmetic with NumPy Arrays array-oriented programming, Array-Oriented Programming with Arrays-Unique and Other Set Logic as structured arrays, Structured and Record Arrays-Why Use Structured Arrays?

<Press Tab>
a.capitalize  a.format   a.isupper    a.rindex      a.strip
a.center      a.index    a.join       a.rjust       a.swapcase
a.count       a.isalnum  a.ljust      a.rpartition  a.title
a.decode      a.isalpha  a.lower      a.rsplit      a.translate
a.encode      a.isdigit  a.lstrip     a.rstrip      a.upper
a.endswith    a.islower  a.partition  a.split       a.zfill
a.expandtabs  a.isspace  a.replace    a.splitlines
a.find        a.istitle  a.rfind      a.startswith

Attributes and methods can also be accessed by name via the getattr function:

In [27]: getattr(a, 'split')
Out[27]: <function str.split>

In other languages, accessing objects by name is often referred to as “reflection.” While we will not extensively use getattr and the related functions hasattr and setattr in this book, they can be used very effectively to write generic, reusable code.

Duck typing

Often you may not care about the type of an object but rather only whether it has certain methods or behavior. This is sometimes called “duck typing,” after the saying “If it walks like a duck and quacks like a duck, then it’s a duck.” For example, you can verify that an object is iterable if it implements the iterator protocol. For many objects, this means it has an __iter__ “magic method,” though an alternative and better way to check is to try using the iter function:

def isiterable(obj):
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False

This function would return True for strings as well as most Python collection types:

In [29]: isiterable('a string')
Out[29]: True

In [30]: isiterable([1, 2, 3])
Out[30]: True

In [31]: isiterable(5)
Out[31]: False

A place where I use this functionality all the time is to write functions that can accept multiple kinds of input.
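One way to picture a function that accepts multiple kinds of input is a small normalizing helper. The sketch below is only an illustration: it repeats isiterable so the block runs on its own, and the helper name as_list is an assumption made for this example, not something taken from the excerpt.

```python
def isiterable(obj):
    """Return True if the built-in iter() accepts obj."""
    try:
        iter(obj)
        return True
    except TypeError:  # not iterable
        return False


def as_list(x):
    """Hypothetical helper: turn any non-list iterable into a list."""
    if not isinstance(x, list) and isiterable(x):
        x = list(x)
    return x


print(as_list((1, 2, 3)))  # a tuple is converted to a list
print(as_list("abc"))      # a string is iterable, so it becomes a list of characters
print(as_list(5))          # a non-iterable value is returned unchanged
```

Note that strings count as iterable here, so a caller that wants to treat a string as a single value would need an extra check before calling such a helper.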

US baby names dataset example, US Baby Names 1880–2010-Boy names that became girl names (and vice versa) US Federal Election Commission database example, 2012 Federal Election Commission Database-Donation Statistics by State USA.gov data from Bitly example, 1.USA.gov Data from Bitly-Counting Time Zones with pandas USDA food database example, USDA Food Database-USDA Food Database “two-language” problem, Solving the “Two-Language” Problem data cleaning and preparation (see data wrangling) data loading (see reading data) data manipulation (see data wrangling) data munging (see data wrangling) data selectionfor axis indexes with duplicate labels, Axis Indexes with Duplicate Labels in pandas library, Indexing, Selection, and Filtering-Selection with loc and iloc time series data, Indexing, Selection, Subsetting data structuresabout, Data Structures and Sequences dict comprehensions, List, Set, and Dict Comprehensions dicts, dict-Valid dict key types for pandas library, Introduction to pandas Data Structures-Index Objects list comprehensions, List, Set, and Dict Comprehensions-Nested list comprehensions lists, List-Slicing set comprehensions, List, Set, and Dict Comprehensions sets, set-set tuples, Tuple-Tuple methods data transformation (see transforming data) data typesattributes for, Structured and Record Arrays defined, Data Types for ndarrays, ndarray Object Internals for date and time data, Date and Time Data Types and Tools for ndarrays, Data Types for ndarrays-Data Types for ndarrays in Python, Scalar Types-Dates and times nested, Nested dtypes and Multidimensional Fields NumPy hierarchy, NumPy dtype Hierarchy parent classes of, NumPy dtype Hierarchy data wranglingcombining and merging datasets, Combining and Merging Datasets-Combining Data with Overlap defined, Jargon handling missing data, Handling Missing Data-Filling In Missing Data hierarchical indexing, Hierarchical Indexing-Indexing with a DataFrame’s columns, Reshaping with Hierarchical Indexing pivoting 
data, Pivoting “Long” to “Wide” Format-Pivoting “Wide” to “Long” Format reshaping data, Reshaping with Hierarchical Indexing string manipulation, String Manipulation-Vectorized String Functions in pandas transforming data, Data Transformation-Computing Indicator/Dummy Variables working with delimited formats, Working with Delimited Formats-Working with Delimited Formats databasesDataFrame joins, Database-Style DataFrame Joins-Database-Style DataFrame Joins pandas interacting with, Interacting with Databases storing data in, Pivoting “Long” to “Wide” Format DataFrame data structureabout, pandas, DataFrame-DataFrame, Nested dtypes and Multidimensional Fields database-stye joins, Database-Style DataFrame Joins-Database-Style DataFrame Joins indexing with columns, Indexing with a DataFrame’s columns JSON data and, JSON Data operations between Series and, Operations between DataFrame and Series optional function arguments, Reading and Writing Data in Text Format plot method arguments, Line Plots possible data inputs to, DataFrame ranking data in, Sorting and Ranking sorting considerations, Sorting and Ranking, Indirect Sorts: argsort and lexsort summary statistics methods for, Correlation and Covariance DataOffset object, Operations with Time Zone−Aware Timestamp Objects datasetscombining and merging, Combining and Merging Datasets-Combining Data with Overlap MovieLens 1M example, MovieLens 1M Dataset-Measuring Rating Disagreement US baby names example, US Baby Names 1880–2010-Boy names that became girl names (and vice versa) US Federal Election Commission database example, 2012 Federal Election Commission Database-Donation Statistics by State USA.gov data from Bitly example, 1.USA.gov Data from Bitly-Counting Time Zones with pandas USDA food database example, USDA Food Database-USDA Food Database date data type, Dates and times, Date and Time Data Types and Tools date offsets, Frequencies and Date Offsets, Shifting dates with offsets-Shifting dates with offsets date 
ranges, generating, Generating Date Ranges-Generating Date Ranges dates and timesabout, Dates and times converting between strings and datetime, Converting Between String and Datetime-Converting Between String and Datetime data types and tools, Date and Time Data Types and Tools formatting specifications, Converting Between String and Datetime, Converting Between String and Datetime generating date ranges, Generating Date Ranges-Generating Date Ranges period arithmetic and, Periods and Period Arithmetic-Creating a PeriodIndex from Arrays datetime data typeabout, Dates and times, Date and Time Data Types and Tools-Date and Time Data Types and Tools converting between strings and, Converting Between String and Datetime-Converting Between String and Datetime format specification for, Converting Between String and Datetime datetime module, Dates and times, Date and Time Data Types and Tools datetime64 data type, Time Series Basics DatetimeIndex class, Time Series Basics, Generating Date Ranges, Time Zone Localization and Conversion dateutil package, Converting Between String and Datetime date_range function, Generating Date Ranges-Generating Date Ranges daylight saving time (DST), Time Zone Handling debug function, Other ways to make use of the debugger %debug magic function, Exceptions in IPython, Interactive Debugger debugger, IPython, Interactive Debugger-Other ways to make use of the debugger decode method, Bytes and Unicode def keyword, Functions, Anonymous (Lambda) Functions default values for dicts, Default values defaultdict class, Default values del keyword, dict, DataFrame del method, DataFrame delete method, Index Objects delimited formats, working with, Working with Delimited Formats-Working with Delimited Formats dense method, Sorting and Ranking density plots, Histograms and Density Plots-Histograms and Density Plots deque (double-ended queue), Adding and removing elements describe method, Summarizing and Computing Descriptive Statistics, Data Aggregation 
design matrix, Creating Model Descriptions with Patsy det function, Linear Algebra development tools for IPython (see software development tools for IPython) %dhist magic function, Interacting with the Operating System diag function, Linear Algebra Dialect class, Working with Delimited Formats dict comprehensions, List, Set, and Dict Comprehensions dict function, Creating dicts from sequences dictionary-encoded representation, Background and Motivation dicts (data structures)about, dict creating from sequences, Creating dicts from sequences DataFrame data structure as, DataFrame default values, Default values grouping with, Grouping with Dicts and Series Series data structure as, Series valid key types, Valid dict key types diff method, Summarizing and Computing Descriptive Statistics difference method, set, Index Objects difference_update method, set dimension tables, Background and Motivation directories, bookmarking in IPython, Directory Bookmark System %dirs magic function, Interacting with the Operating System discretization, Discretization and Binning distplot method, Histograms and Density Plots div method, Arithmetic methods with fill values divide function, Universal Functions: Fast Element-Wise Array Functions divmod function, Universal Functions: Fast Element-Wise Array Functions dmatrices function, Creating Model Descriptions with Patsy dnorm function, Estimating Linear Models dot function, Transposing Arrays and Swapping Axes, Linear Algebra-Linear Algebra downsampling, Resampling and Frequency Conversion, Downsampling-Open-High-Low-Close (OHLC) resampling dreload function, Reloading Module Dependencies drop method, Index Objects, Dropping Entries from an Axis dropna method, Handling Missing Data-Filtering Out Missing Data, Example: Filling Missing Values with Group-Specific Values, Pivot Tables and Cross-Tabulation drop_duplicates method, Removing Duplicates DST (daylight saving time), Time Zone Handling dstack function, Concatenating and Splitting 
Arrays dtype (see data types) dtype attribute, The NumPy ndarray: A Multidimensional Array Object, Data Types for ndarrays duck typing, Duck typing dummy variables, Computing Indicator/Dummy Variables-Computing Indicator/Dummy Variables, Creating dummy variables for modeling, Interfacing Between pandas and Model Code, Categorical Data and Patsy dumps function, JSON Data duplicate dataaxis indexes with duplicate labels, Axis Indexes with Duplicate Labels removing, Removing Duplicates time series with duplicate indexes, Time Series with Duplicate Indices duplicated method, Removing Duplicates dynamic references in Python, Dynamic references, strong types E edit-compile-run workflow, IPython and Jupyter education, continuing, Continuing Your Education eig function, Linear Algebra elif statement, if, elif, and else else statement, if, elif, and else empty function, Creating ndarrays-Creating ndarrays empty namespace, The %run Command empty_like function, Creating ndarrays encode method, Bytes and Unicode end-of-line (EOL) markers, Files and the Operating System endswith method, String Object Methods, Vectorized String Functions in pandas enumerate function, enumerate %env magic function, Interacting with the Operating System EOL (end-of-line) markers, Files and the Operating System equal function, Universal Functions: Fast Element-Wise Array Functions error handling in Python, Errors and Exception Handling-Exceptions in IPython escape characters, Strings ewm function, Exponentially Weighted Functions Excel files (Microsoft), Reading Microsoft Excel Files-Reading Microsoft Excel Files ExcelFile class, Reading Microsoft Excel Files exception handling in Python, Errors and Exception Handling-Exceptions in IPython exclamation point (!)


pages: 680 words: 157,865

Beautiful Architecture: Leading Thinkers Reveal the Hidden Beauty in Software Design by Diomidis Spinellis, Georgios Gousios

Albert Einstein, barriers to entry, business intelligence, business logic, business process, call centre, continuous integration, corporate governance, database schema, Debian, domain-specific language, don't repeat yourself, Donald Knuth, duck typing, en.wikipedia.org, fail fast, fault tolerance, financial engineering, Firefox, Free Software Foundation, functional programming, general-purpose programming language, higher-order functions, iterative process, linked data, locality of reference, loose coupling, meta-analysis, MVC pattern, Neal Stephenson, no silver bullet, peer-to-peer, premature optimization, recommendation engine, Richard Stallman, Ruby on Rails, semantic web, smart cities, social graph, social web, SPARQL, Steve Jobs, Stewart Brand, Strategic Defense Initiative, systems thinking, the Cathedral and the Bazaar, traveling salesman, Turing complete, type inference, web application, zero-coupon bond

Latent typing has been popularized recently thanks to its widespread adoption in the Ruby programming language. The term “duck typing” is a tongue-in-cheek reference to inductive reasoning, attributed to James Whitcomb Riley, which goes: If it walks like a duck and quacks like a duck, I would call it a duck. To see the importance of duck typing, take an essential feature of object-oriented programming, polymorphism. Polymorphism stands for the use of different types in the same context. One way to achieve polymorphism is through inheritance. A subclass can be used (more precisely, should be used, because programmers can be careless) wherever a superclass can be used. Duck typing offers an additional way to achieve polymorphism: a type can be used anywhere it offers methods fitting the context.
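A minimal Python sketch of polymorphism without inheritance, under stated assumptions: the class names Dog and Robot echo the pet and robot example the book refers to elsewhere, while the speak method and the greet function are hypothetical names chosen for this illustration.

```python
class Dog:
    """Hypothetical class; it shares no superclass with Robot."""

    def speak(self):
        return "Woof"


class Robot:
    """Hypothetical class; unrelated to Dog except for offering speak()."""

    def speak(self):
        return "Beep"


def greet(thing):
    # No isinstance check: any object that offers speak() fits this context.
    return thing.speak()


for thing in (Dog(), Robot()):
    print(greet(thing))
```

An object without a speak method would fail inside greet with an AttributeError at call time, which is the usual trade-off of this style compared with a declared interface.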

Types Are Defined Implicitly Although everything in Smalltalk, even classes, is an object, classes do not correspond to types in the way they do in languages such as C++ and Java. Types are defined implicitly by what they do, and by their interfaces. This is described by names such as latent typing or duck typing. Latent typing is the only typing mechanism in Smalltalk (and also in some other dynamically typed languages), but that does not mean it is of no importance to strongly typed languages. In C++, for instance, latent typing is the basis of generic programming via templates. It makes sense to see it first in that language.

Duck typing offers an additional way to achieve polymorphism: a type can be used anywhere it offers methods fitting the context. In the pet and robot example shown earlier in Python and C++, Dog and Robot do not share a superclass. Of course it is possible to program your way around duck typing using only the inheritance type of polymorphism. A programmer, however, is wealthier if she has more tools at her disposal for solving the problem she faces. As long as the plurality of tools does not get in the way, she can choose among them as best fits the situation. This has been expressed very elegantly by Bjarne Stroustrup in The Design and Evolution of C++ (1994, p. 23): My interest in computers and programming languages is fundamentally pragmatic.


Scala in Action by Nilanjan Raychaudhuri

business logic, continuous integration, create, read, update, delete, database schema, domain-specific language, don't repeat yourself, duck typing, en.wikipedia.org, failed state, fault tolerance, functional programming, general-purpose programming language, higher-order functions, index card, Kanban, MVC pattern, type inference, web application

Table 10.2 shows the list.

Table 10.2. Techniques to implement dependency injection

Technique | Description
Cake pattern | Handles dependency using trait mixins and abstract members.
Structural typing | Uses structural typing to manage dependencies. The Scala structural typing feature provides duck typing[a] in a type-safe manner. Duck typing is a style of dynamic typing in which the object’s current behavior is determined by the methods and properties currently associated with the object.
Implicit parameters | Manages dependencies using implicit parameters so that as a caller you don’t have to pass them. In this case, dependencies could be easily controlled using scope.

Function currying is a technique by which you can transform a function with multiple arguments into multiple functions that take a single argument and chain them together.
Using a DI framework | Most of the techniques mentioned here will be home-grown. I show you how to use a DI framework in your Scala project.

[a] Duck typing, http://en.wikipedia.org/wiki/Duck_typing.

These techniques can help to write more testable code and provide a scalable solution in Scala. Let’s take our favorite CalculatePriceService and apply each of the techniques mentioned in the table.

10.4.2. Cake pattern

A cake pattern[13] is a technique to build multiple layers of indirection in your application to help with managing dependencies.

This section explores various types offered by the Scala type system. 8.2.1. Structural types Structural typing in Scala is the way to describe types by their structure, not by their names, as with other typing. If you’ve used dynamically typed languages, a structural type may give you the feel of duck typing (a style of dynamic typing) in a type-safe manner. Let’s say you want to close any resource after use as long as it’s closable. One way to do that would be to define a trait that declares a close method and have all the resources implement the trait. But using a structural type, you can easily define a new type by specifying its structure, like the following: def close(closable: { def close: Unit }) = { closable.close } The type of the parameter is defined by the { def close: Unit } structure.


pages: 643 words: 53,639

Rapid GUI Programming With Python and Qt by Mark Summerfield

Debian, duck typing, Guido van Rossum, loose coupling, MVC pattern, software patent, sorting algorithm, web application

For example:

a = Painting("Cecil Collins", "The Sleeping Fool", 1943)
print(a)  # Prints "The Sleeping Fool by Cecil Collins in 1943"
b = Sculpture("Auguste Rodin", "The Secret", 1925, "bronze")
print(b)  # Prints "The Secret by Auguste Rodin in 1925 (bronze)"

Although we have shown polymorphism using a special method, it works exactly the same for ordinary methods. Python uses dynamic typing, also called duck typing (“If it walks like a duck and it quacks like a duck, it is a duck”). This is very flexible. For example, suppose we had a class like this:

class Title(object):
    def __init__(self, title):
        self.__title = title

    def title(self):
        return self.__title

Now we could do this:

items = []
items.append(Painting("Cecil Collins", "The Poet", 1941))
items.append(Sculpture("Auguste Rodin", "Naked Balzac", 1917, "plaster"))
items.append(Title("Eternal Springtime"))
for item in items:
    print(item.title())

This will print the title of each item, even though the items are of different types.
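The excerpt does not show how Painting and Sculpture are defined, so the loop cannot be run as quoted. The following self-contained sketch fills that gap with hypothetical stand-ins: the class bodies are assumptions written for this illustration and are not the book's implementations. Only the shared title() method matters for the point being made.

```python
class Painting:
    """Hypothetical stand-in for the book's Painting class."""

    def __init__(self, artist, title, year):
        self._artist = artist
        self._title = title
        self._year = year

    def title(self):
        return self._title


class Sculpture:
    """Hypothetical stand-in; it also records a material."""

    def __init__(self, artist, title, year, material):
        self._artist = artist
        self._title = title
        self._year = year
        self._material = material

    def title(self):
        return self._title


class Title:
    """Unrelated to the classes above, but it also offers title()."""

    def __init__(self, title):
        self._title = title

    def title(self):
        return self._title


items = [
    Painting("Cecil Collins", "The Poet", 1941),
    Sculpture("Auguste Rodin", "Naked Balzac", 1917, "plaster"),
    Title("Eternal Springtime"),
]

# Each object is used purely through the method it offers, not its class.
titles = [item.title() for item in items]
print(titles)
```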

We also made use of much of the knowledge gained from the previous chapters, including some of Python’s advanced features, such as list comprehensions and generator methods. This chapter also showed how to do both single and multiple inheritance, and gave an example of how to create a simple interface class. We learned more about using isinstance() for type testing, and about hasattr() and duck typing. We concluded the chapter with an overview of how Python modules and multifile applications work. We also looked at the doctest module and saw how easy it is to create unit tests that look like examples in our docstrings. We now know the Python language fundamentals. We can create variables, use collections, and create our own data types and collection types.

C:\pyqt\chap07>python findandreplacedlg.py Unless using automated testing tools, it is often helpful to add testing functionality to dialogs. It does not take too much time or effort to write them, and running them whenever a change is made to the dialog’s logic will help minimize the introduction of bugs. Sometimes we pass complex objects to dialogs that may appear to make testing impossible. But thanks to Python’s duck typing we can always create a fake class that simulates enough behavior to be usable for testing. For example, in Chapter 12, we use a property editor dialog. This dialog operates on “Node” objects, so in the testing code we create a FakeNode class that provides the methods for setting and getting a node’s properties that the dialog makes use of.
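A minimal sketch of the fake-class idea, assuming a dialog that only ever sets and gets named properties on a node. The method names `set_property` and `get_property` and the function `rename_node` are assumptions made for this example; the book's actual FakeNode interface is not shown in the excerpt.

```python
class FakeNode:
    """Hypothetical stand-in that stores properties in a plain dict."""

    def __init__(self):
        self._properties = {}

    def set_property(self, name, value):
        self._properties[name] = value

    def get_property(self, name):
        return self._properties.get(name)


def rename_node(node, new_name):
    """Stand-in for dialog logic; it relies only on the two methods above."""
    node.set_property("name", new_name.strip())
    return node.get_property("name")


fake = FakeNode()
print(rename_node(fake, "  root "))  # root
```

Because `rename_node` never checks the node's class, the fake can replace a complex real object in a test without any changes to the code under test.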


Python Web Development With Django by Jeff Forcier

business logic, create, read, update, delete, database schema, Debian, don't repeat yourself, duck typing, en.wikipedia.org, Firefox, full text search, functional programming, Guido van Rossum, loose coupling, MVC pattern, revision control, Ruby on Rails, Silicon Valley, slashdot, SQL injection, web application

Because you can remap variable names like this, you are never really 100 percent sure what type of object a variable is pointing to at any given time, unless you ask the interpreter for more information. However, as long as a given variable behaves like a certain type (for example, if it has all the methods a string normally has), it can be considered to be of that type, even if it has extra attributes. This is referred to as “duck-typing”—if it waddles like a duck and quacks like a duck, then we can treat it as a duck. Operators As far as operators in general go, Python supports pretty much the same ones you’re used to from other programming languages. These include arithmetic operators, such as +, -, and *, and so on, and this includes their corresponding augmented assignment operators, +=, -=, *=, and so forth. This just means instead of x = x + 1, you can use x += 1. Absent are the increment/decrement operators (++ and --) you may have used in other languages.
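The "behaves like a certain type" idea can be illustrated with file-like objects. In this sketch, `LineCollector` and `write_report` are hypothetical names; the only real library object used is `io.StringIO` from the standard library.

```python
import io


class LineCollector:
    """Hypothetical class that is not a file but has a write() method."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)


def write_report(out, lines):
    # Only out.write() is required; the concrete type of `out` is never checked.
    for line in lines:
        out.write(line + "\n")


buffer = io.StringIO()
write_report(buffer, ["alpha", "beta"])

collector = LineCollector()
write_report(collector, ["alpha", "beta"])

print(buffer.getvalue() == "".join(collector.chunks))  # True
```

Both objects are treated as "something writable" because both provide the one method the function uses.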

QuerySets can be thought of as simply lists of model class instances (or database rows/records), but they’re much more powerful than that. Managers provide a jumping-off point for generating queries, but QuerySets are where most of the action really happens. QuerySets are multifaceted objects, making good use of Python’s dynamic nature, flexibility, and so-called “duck typing” to provide a trio of important and powerful behaviors; they are database queries, containers, and building blocks all rolled into one. QuerySet as Database Query As evidenced by the name, a QuerySet can be thought of as a nascent database query. It can be translated into a string of SQL to be executed on the database.

Displaying Forms Piecemeal In addition to the convenience methods outlined previously, it’s possible to exert finer control over how your form is arranged. The individual Field objects are available through dictionary keys on the form itself, enabling you to display them whenever and wherever you want. You can also iterate over the form itself, thanks to Python’s duck typing capabilities. Regardless of how you obtain them, each field has its own errors attribute, a list-like object whose string representation is the same unordered list previously displayed in the wholesale methods (and overridden the same way). In oldforms, the simplest way to override the default HTML representation of a form field was to access the field’s data attribute and wrap it with custom HTML—a trick that is still possible with newforms.
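How an object can support both dictionary-style access and iteration can be sketched in plain Python. `SimpleForm` below is a hypothetical class written for this illustration; it is not Django's form implementation and makes no claim about how Django does it internally.

```python
class SimpleForm:
    """Hypothetical form-like container, unrelated to Django's own classes."""

    def __init__(self, **fields):
        self._fields = dict(fields)

    def __getitem__(self, name):
        # Enables form["email"], like a dictionary lookup.
        return self._fields[name]

    def __iter__(self):
        # Enables `for field in form`, like iterating over a list.
        return iter(self._fields.values())


form = SimpleForm(name="text input", email="email input")
print(form["email"])  # email input
for field in form:
    print(field)
```

Python's `for` statement and subscript syntax only look for these special methods, so any class that defines them can be used like a built-in container.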


pages: 448 words: 71,301

Programming Scala by Unknown

billion-dollar mistake, business logic, domain-specific language, duck typing, en.wikipedia.org, fault tolerance, functional programming, general-purpose programming language, higher-order functions, information security, loose coupling, type inference, web application

We will return to self-type annotations as a component composition model in Chapter 13. See “Self-Type Annotations and Abstract Type Members” on page 317 and “Dependency Injection in Scala: The Cake Pattern” on page 334. Structural Types You can think of structural types as a type-safe approach to duck typing, the popular name for the way method resolution works in dynamically typed languages. In Ruby, for example, when you write starFighter.shootWeapons, the runtime looks for a shootWeapons method on the object referenced by starFighter. That method, if found, might have been defined in the class used to instantiate starFighter or one of its parents or “included” modules.

An internal DSL is an idiomatic form of a general-purpose programming language. That is, no special-purpose parser is created for the language. Instead, DSL code is written in the general-purpose language and parsed just like any other code. An external DSL is a language with its own grammar and parser. Duck Typing A term used in languages with dynamic typing for the way method resolution works. As long as an object accepts a method call (message send), the runtime is satisfied. “If it walks like a duck and talks like a duck, it’s a duck.” Contrast with the use of structural types in some statically typed languages like Scala.

Structural Type A structural type is like an anonymous type, where only the “structure” a candidate type must support is specified, such as members that must be present. Structural types do not name the candidate types that can match, nor do any matching types need to share a common parent trait or class with the structural type. Hence, structural types are a type-safe analog to duck typing in dynamically typed languages, like Ruby. Subtype A synonym for derived type. Supertype A synonym for parent type. Symbol An interned string. Literal symbols are written starting with a single “right quote,” e.g., 'name. Tail-Call Recursion A form of recursion where a function calls itself as the last thing it does, i.e., it does no additional computations with the result of the recursive call.


pages: 263 words: 20,730

Exploring Python by Timothy Budd

c2.com, centre right, duck typing, functional programming, general-purpose programming language, Guido van Rossum, higher-order functions, index card, random walk, sorting algorithm, web application

In an abstract sense, we have created a “type” that represents the concept of the stack, and both classes implement this type. However, this is only an abstract characterization, and there is no actual entity that represents this type. This idea is often termed duck typing, after the folk saying: “if it walks like a duck, and talks like a duck, then it probably is a duck”. Duck typing is found in dynamically typed object-oriented languages, such as Python. (Another programming language with similar features is Smalltalk). Strongly typed languages, such as Java and C++, use an entirely different approach to typing.

Walk through the scenarios, and for each action (verb) make sure you have identified an agent (represented by a CRC card) responsible for performing the action. Once you are satisfied that you have captured all the actions necessary to make your application work as you expect, the CRC cards can then be used as a basis for coding the classes in your application. Duck Typing In the object-oriented paradigm, classes are characterized by the services they provide. A consequence of this view is that any value can be replaced by another value, as long as the replacement provides the same service, that is, the same actions using the same names. This is true even if the implementation of the service is radically different.
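The point about replacing one value with another that offers the same service can be sketched with two stack classes. The class names, method names, and the `reverse` helper are assumptions for this example and are not taken from the book's own listings.

```python
class ListStack:
    """Stack backed by a Python list."""

    def __init__(self):
        self._items = []

    def push(self, value):
        self._items.append(value)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items


class LinkedStack:
    """Stack backed by nested (value, rest) tuples."""

    def __init__(self):
        self._head = None

    def push(self, value):
        self._head = (value, self._head)

    def pop(self):
        value, self._head = self._head
        return value

    def is_empty(self):
        return self._head is None


def reverse(values, stack):
    # Works with any object that offers push(), pop() and is_empty().
    for value in values:
        stack.push(value)
    result = []
    while not stack.is_empty():
        result.append(stack.pop())
    return result


print(reverse([1, 2, 3], ListStack()))    # [3, 2, 1]
print(reverse([1, 2, 3], LinkedStack()))  # [3, 2, 1]
```

The two classes share no parent and store data differently, yet `reverse` cannot tell them apart because they use the same method names.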


pages: 59 words: 12,801

The Little Book on CoffeeScript by Alex MacCaw

duck typing, Firefox, MVC pattern, node package manager, web application, Y2K

Here’s an example implementation ported from jQuery’s $.type: type = do -> classToType = {} types = [ "Boolean" "Number" "String" "Function" "Array" "Date" "RegExp" "Undefined" "Null" ] for name in types classToType["[object #{name}]"] = name.toLowerCase() (obj) -> strType = Object::toString.call(obj) classToType[strType] or "object" # Returns the sort of types we'd expect: type("") # "string" type(new String) # "string" type([]) # "array" type(/\d/) # "regexp" type(new Date) # "date" type(true) # "boolean" type(null) # "null" type({}) # "object" If you’re checking to see if a variable has been defined, you’ll still need to use typeof; otherwise, you’ll get a ReferenceError: if typeof aVar isnt "undefined" objectType = type(aVar) Or more succinctly with the existential operator: objectType = type(aVar?) As an alternative to type checking, you can often use duck typing and the CoffeeScript existential operator together, which eliminates the need to resolve an object’s type. For example, let’s say we’re pushing a value onto an array. We could say that, as long as the “array like” object implements push(), we should treat it like an array: anArray?.push? aValue If anArray is an object other than an array, then the existential operator will ensure that push() is never called.


pages: 629 words: 83,362

Programming TypeScript by Boris Cherny

billion-dollar mistake, bitcoin, business logic, database schema, don't repeat yourself, duck typing, Firefox, functional programming, Internet of things, pull request, source of truth, SQL injection, type inference, web application

This is by design: JavaScript is generally structurally typed, so TypeScript favors that style of programming over a nominally typed style. Structural typing A style of programming where you just care that an object has certain properties, and not what its name is (nominal typing). Also called duck typing in some languages (or, not judging a book by its cover). There are a few ways to use types to describe objects in TypeScript. The first is to declare a value as an object: let a: object = { b: 'x' } What happens when you access b? a.b // Error TS2339: Property 'b' does not exist on type 'object'.

declarationMap (compiler option), Project References declare keyword, Type Declarations, Type Declarations declare module syntax, Ambient Module Declarations decorators, Decorators-Decoratorsexample, @serializable decorator, Decorators expected type signatures for decorator functions, Decorators use by Angular, Components defaultsfor function parameters, Optional and Default Parameters generic type, Generic Type Defaults definite assignment assertions, Definite Assignment Assertions DefinitelyTypedJavaScript with type declarations on, JavaScript That Has Type Declarations on DefinitelyTyped JavaScript without type declarations on, JavaScript That Doesn’t Have Type Declarations on DefinitelyTyped dependency injector (Angular), Services design patterns, Design Patterns-Builder Patternbuilder pattern, Builder Pattern factory pattern, Factory Pattern dialog, closing with nonnull assertions, Nonnull Assertions directivestriple-slash, Triple-Slash Directives-The amd-module Directive, Triple-Slash Directivesamd-module, The amd-module Directive types, The types Directive distributive property of conditional types following, Distributive Conditionals DOM (Document Object Model), PrefaceDOM type declarations, enabling, lib enabling DOM APIs in TSC compiler, Frontend Frameworks events, typed in TypeScript standard library, Event Emitters overloads in browser DOM APIs, Overloaded Function Types dom library for TSC, In the Browser: With Web Workers Don't Repeat Yourself (DRY), Type aliases downlevelIteration TSC flag, Iterators DRY (Don't Repeat Yourself), Type aliases duck typing, Objects(see also structural typing) dynamic imports, Dynamic Imports E editors, When are types checked?autocompletion in, Frontend Frameworks Either type, The Option Type engines (JavaScript), The Compiler enums, Enums-Summary, Simulating Nominal Typesassignability, Assignability avoiding use of, Enums const, Enums naming conventions, Enums preserveConstEnums TSC flag, Enums splitting across multiple 
declarations, Enums with explicit numeric values, example of, Enums with inferred numeric values, example of, Enums with partially inferred values, example of, Enums errorserror messages in TypeScript, Introduction handling, Handling Errors-Exercisesin Promises, Regaining Sanity with Promises returning exceptions, Returning Exceptions-Returning Exceptions returning null, Returning null throwing exceptions, Throwing Exceptions-Throwing Exceptions using Option type, The Option Type-The Option Type in callbacks, Working with Callbacks in JavaScript, Introduction in Promise implementation, Regaining Sanity with Promises monitoring in TypeScript projects, Error Monitoring surfacing, in JavaScript versus TypeScript, When are errors surfaced?


pages: 706 words: 120,784

The Joy of Clojure by Michael Fogus, Chris Houser

cloud computing, Dennis Ritchie, domain-specific language, Donald Knuth, Douglas Hofstadter, duck typing, en.wikipedia.org, finite state, functional programming, Gödel, Escher, Bach, haute couture, higher-order functions, Larry Wall, Paul Graham, rolodex, SQL injection, traveling salesman

The object constructed here is some kind of map, but it isn’t, as far as Clojure is concerned, a TreeNode. That means that when used in its simple form as we did here, there’s no clean way[2] to determine whether any particular map is a TreeNode or not. [2] You could test a map for the existence of the keys :val, :l, and :r, a sort of duck-typing but on fields instead of methods. But because there exists a real possibility that some other kind of object may happen to have these keys but use them in a different way, undesirable complexity and/or unexpected behavior is likely. Fortunately, you can mitigate this risk by using namespace-qualified keywords.

, 2nd, 3rd continuation-passing style (CPS), 4th, 5th accept function return continuation continue, 2nd contrib, 2nd control structures, 2nd, 3rd, 4th conversion specification coordination, 2nd copies, defensive counted counter-optimizations count-tweet-text-task, 2nd create-ns crypto D data structures, immutable database deadlock, 2nd, 3rd, 4th, 5th deterministic debug debug console debugging, 2nd, 3rd debug-repl decimal declarative, 2nd declare def, 2nd, 3rd, 4th, 5th, 6th default-handler defformula definterface defmacro, 2nd, 3rd defmulti, 2nd defn, 2nd, 3rd, 4th, 5th, 6th defonce defprotocol defrecord, 2nd defstruct, 2nd downfall of defstructs defunits-of def-watched delay, 2nd, 3rd, 4th, 5th delegate, 2nd dependency injection (DI) deref, 2nd, 3rd, 4th derive descendants Design Patterns: Elements of Reusable Object-Oriented Software destructuring, 2nd, 9th, 10th, 11th, 12th associative destructuring in function parameters nested versus accessor methods with a map with a vector determinacy directive, 2nd directory structure disappear dispatch display dissoc, 2nd, 3rd distributive do, 2nd, 3rd doall documentation, viewing domain domain-specific language (DSL), 2nd, 3rd, 4th, 5th, 6th, 7th, 10th, 11th domain expertise putting parentheses around the specification unit conversion DSL don’t panic doseq, 2nd, 3rd dosync, 2nd dothreads!, 2nd doto double, 2nd double-array double-backslash double-quotes doubles, 2nd do-until drawing duck typing dummy write durability DynaFrame.clj dynamic scope, 2nd, 3rd dynamic type systems, 2nd E elegance, 2nd embedding empowerment empty sequence empty? 
encapsulation, 5th block-level encapsulation local encapsulation namespace encapsulation Enlive enumeration values enumerator env ephemeral equality, 2nd, 6th, 7th equality partitions, 2nd, 3rd equality semantics error handling, 2nd, 3rd escaped evaluation contextual-eval, 2nd eval, 2nd meta-circular exceptions, 5th, 9th, 10th, 20th exceptions exceptions exceptions exceptions catch, 2nd checked compile-time, 2nd ConcurrentModification-Exception finally, 2nd handling java.lang.ClassCastException java.lang.Exception java.lang.NullPointer-Exception java.lang.RuntimeException runtime runtime vs. compile-time throw, 2nd, 3rd expand-clause expansion expected case experimentation expression problem extend, 2nd, 3rd, 4th extend-protocol, 2nd extend-type, 2nd Extensible Markup Language (XML), 2nd, 3rd F Factor (programming language), 2nd factory methods fail, 2nd false, 3rd evil-false Fantom (programming language) fence post errors filter, 2nd, 3rd, 4th, 5th find-doc find-ns finite state machines first, 2nd, 3rd, 4th, 5th First In, First Out (FIFO), 2nd First In, Last Out (FILO) first-class, 2nd, 3rd fixed-size pool FIXO, 3rd, 5th fixo-peek fixo-push, 2nd, 3rd flexibility float, 2nd floating point, 2nd, 5th overflow rounding error underflow, 2nd floats fluent builder FluentMove fn, 2nd, 3rd, 4th, 5th, 6th for, 2nd force, 2nd, 3rd forever form free variables freedom to focus frequencies Frink (programming language), 2nd frustrating fully qualified, 2nd, 3rd fun functions, 6th anonymous, 2nd, 3rd, 4th arity Calling Functions dangerous function signatures local multiple function bodies named arguments G Gang of Four garbage collection, 2nd, 3rd gcd gen-class, 2nd, 3rd, 4th, 5th, 6th generalized tail-call optimization, 2nd generic genotype gensym get, 2nd, 3rd, 4th get-in getter global hierarchy map goal Gödel, Escher, Bach: An Eternal Golden Braid good-move Graham, Paul, 2nd graphic graphical user interface (GUI), 2nd, 3rd, 4th graphics context greatest common denominator, 2nd 
green thread Greenspun’s Tenth Rule, 2nd Groovy (programming language) H Halloway, Stuart, 2nd has hash maps hash-map, 2nd, 3rd, 4th Haskell (programming language), 2nd, 3rd, 4th, 5th, 7th, 9th, 11th, 13th out of order execution Template Haskell typeclasses, 2nd heuristic Hickey, Rich, 2nd hidden hierarchy history homoiconicity, 2nd hooks, 2nd hops host semantics Hoyte, Doug hyphens I I/O, 2nd, 3rd idempotent, 2nd identical?


Practical OCaml by Joshua B. Smith

cellular automata, Debian, domain-specific language, duck typing, Free Software Foundation, functional programming, general-purpose programming language, Grace Hopper, higher-order functions, hiring and firing, John Conway, Paul Graham, slashdot, SpamAssassin, text mining, Turing complete, type inference, web application, Y2K

In compiled code, the type inference occurs automatically. # let np = new polyclass;; val np : '_a polyclass = <obj> # np#polyfunc (Some 10);; - : int = 10 # np;; - : int polyclass = <obj> # Direct Objects In OCaml, objects can be created directly (they are called direct objects in the OCaml documentation). This direct creation is often referred to as duck typing (a play on the phrase “if it walks like a duck and quacks like a duck, it is a duck”). Although direct objects do not have to be classes or need to be instantiated, there are some restrictions. One of the most prominent restrictions on direct objects is that they cannot be inherited from. Direct objects can be useful for prototyping.

See CGI comparator function, 96, 137 compare function, 37, 90, 137 comparison functions, 90 compile-time flags, 310 compiler flags, 405 compilers, 405 Complex library, 29 composite types, 68 composition, vs. inheritance, 238 concat function, 93, 97, 137 620Xidxfinal.qxd 9/22/06 4:19 PM Page 447 ■INDEX ■D data member access, 233 data structures, 51, 252 data types, 225 data-driven applications, 2 data-handling functions, for securities trades database, 54–59 databases creating, 51–60 displaying/importing data and, 73–87 reports and, 73–87 DbC (Design by Contract), 133 deadlocks, 313 debuggers, OCaml and, 404 debugging, threads and, 309, 327 default match, 47 delay function, 311, 317 dependency graphs, 145, 148 Design by Contract (DbC), 133 diamond-shaped inheritance, 241 diff command, 380 difftime function, 359 direct objects, 231 directed graphs, 347 directories, reading, 119 -disntr flag, 408 distance calculations, 21 distance type, 41–44 Division_by_zero exception, 127, 172 -dllib –l<LIBNAME> flag, 407 -dllpath <DIR> flag, 407 documentation extracting from comments, 145 ocamldoc for, 145–154 domain-specific languages (DSLs), 203, 411, 415, 419 dot notation, 17 doubles, copying, 352 double_val(v) function, 352 downloads, multipart file, 278 Doxygen, 145 dreaded diamond, 241 DSLs (domain-specific languages), 203, 411, 415, 419 -dtypes compiler flag, 406 duck typing, 231 Dynalink, 289 dynamic linking of code, 356 ■E eager type, 67 EBNF (Extended Backus-Naur Form), 210 echo servers, 179 Eclipse, 14, 402 edit distance, 243–248 Emacs, 14, 402 email type, 136 emptybmp function, 385 encapsulation, 225, 233, 245 encoding, 65 Endianness, 375 End_of_file exception, 78, 126, 131 entry points, 262, 267 Find it faster at http://superindex.apress.com/ concurrency, 271, 309 Condition module, 318 condition variables, 313, 318 configuration file parser (sample), 415–419 Configure macros, 401 constraining types, 35 constraints, parameterized classes and, 234 constructor arguments, 
229 contents attribute, 22 conversions, for distances, 31, 41 Conway’s game, 390 Conway, John, 390 cookies, blog server example and, 283–288 Coq proof assistant, 29 correctness, of programs, 271 Cousineau, Guy, 3 cpp (C preprocessor), 411 CPUs, multiple, 310 create function, 317 arrays and, 97 Condition module and, 318 hashtables and, 100 Thread module and, 311, 316 creating arrays, 97 custom exceptions, 127 custom tags/generators, 153 databases, 51–60 enums, 26 functions, 30–32, 33–36 hashtables, 100 http clients, 120 lists, 96 maps, 109 modules, 156 queues, 103 records, 28 servers, 179 sets, 107 sockets, 120 stacks, 105 threads, 310–316 values, 33–36 curried functions, 17, 36, 41 currying functors, 163 custom exceptions, 127 -custom flag, 407, 409 447 620Xidxfinal.qxd 448 9/22/06 4:19 PM Page 448 ■INDEX enums (enumerated values), 23, 26 eprintf function, 75 equality, floats and, 63 error handling/reporting, 18, 137 errorstring exception and, 414 ocamllex and, 201, 205 revised syntax and, 412 errorstring exception, 414 Euclid’s algorithm, 30 event channels, 315, 319 Event module, 315, 319 events, 315–317, 319 Events module, 340 Ex-nunc framework, 276 examples.


Learning Flask Framework by Matt Copperwaite, Charles Leifer

create, read, update, delete, database schema, Debian, DevOps, don't repeat yourself, duck typing, full text search, place-making, Skype, SQL injection, web application

>>> duck = mock() >>> when(duck).quack().thenReturn("quack") >>> duck.quack() "quack" In the preceding example, we are creating a mocked up duck object, giving it the ability to quack, and then proving that it can quack. In dynamically typed languages such as Python, where an object you have may not be the one you are expecting, it is common practice to use duck-typing. As the phrase says "if it walks like a duck and quacks like a duck, it must be a duck". This is really useful when creating mocking objects, as it is easy to use a fake Mock object without your methods noticing the switch. The difficulty arises when Flask uses its decorators to run methods before your method is run and you need to override it to, for example, replace the database initiator.
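The same idea can be tried without a third-party package. The sketch below swaps in `unittest.mock.Mock` from the Python standard library in place of the mockito-style calls in the excerpt; `make_noise` is a hypothetical function standing in for code under test.

```python
from unittest.mock import Mock


def make_noise(animal):
    """Hypothetical code under test; it only needs animal.quack()."""
    return animal.quack().upper()


duck = Mock()
duck.quack.return_value = "quack"  # give the mock the ability to quack

print(make_noise(duck))  # QUACK
duck.quack.assert_called_once_with()  # the method was called once, with no arguments
```

Since `make_noise` never inspects the type of its argument, it does not notice that it received a mock rather than a real object.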


pages: 504 words: 89,238

Natural language processing with Python by Steven Bird, Ewan Klein, Edward Loper

bioinformatics, business intelligence, business logic, Computing Machinery and Intelligence, conceptual framework, Donald Knuth, duck typing, elephant in my pajamas, en.wikipedia.org, finite state, Firefox, functional programming, Guido van Rossum, higher-order functions, information retrieval, language acquisition, lolcat, machine translation, Menlo Park, natural language processing, P = NP, search inside the book, sparse data, speech recognition, statistical model, text mining, Turing test, W. E. B. Du Bois

A recursive function to traverse a tree. def traverse(t): try: t.node except AttributeError: print t, else: # Now we know that t.node is defined print '(', t.node, for child in t: traverse(child) print ')', >>> t = nltk.Tree('(S (NP Alice) (VP chased (NP the rabbit)))') >>> traverse(t) ( S ( NP Alice ) ( VP chased ( NP the rabbit ) ) ) We have used a technique called duck typing to detect that t is a tree (i.e., t.node is defined). 7.5 Named Entity Recognition At the start of this chapter, we briefly introduced named entities (NEs). Named entities are definite noun phrases that refer to specific types of individuals, such as organizations, persons, dates, and so on. Table 7-3 lists some of the more commonly used types of NEs.
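The traversal above is written for Python 2 and depends on NLTK. A self-contained Python 3 sketch of the same try/except technique follows; the `Node` class and its `label` attribute are assumptions made for the example and are not NLTK's API.

```python
class Node:
    """Hypothetical minimal tree: a label plus a list of children."""

    def __init__(self, label, children):
        self.label = label
        self.children = children

    def __iter__(self):
        return iter(self.children)


def traverse(t, out):
    try:
        label = t.label
    except AttributeError:
        # No label attribute: treat t as a leaf (here, a plain string).
        out.append(str(t))
    else:
        # t has a label, so treat it as a subtree and recurse into it.
        out.append("(")
        out.append(label)
        for child in t:
            traverse(child, out)
        out.append(")")
    return out


tree = Node("S", [
    Node("NP", ["Alice"]),
    Node("VP", ["chased", Node("NP", ["the", "rabbit"])]),
])
print(" ".join(traverse(tree, [])))
# ( S ( NP Alice ) ( VP chased ( NP the rabbit ) ) )
```

The function never calls isinstance(); it attempts the attribute access and lets the AttributeError decide whether the value is a subtree or a leaf.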

English, 63 code blocks, nested, 25 code examples, downloading, 57 code points, 94 codecs module, 95 coindex (in feature structure), 340 collocations, 20, 81 comma operator (,), 133 comparative wordlists, 65 comparison operators numerical, 22 for words, 23 complements of lexical head, 347 complements of verbs, 313 complex types, 373 complex values, 336 components, language understanding, 31 computational linguistics, challenges of natural language, 441 computer understanding of sentence meaning, 368 concatenation, 11, 88 lists and strings, 87 strings, 16 conclusions in logic, 369 concordances creating, 40 graphical POS-concordance tool, 184 conditional classifiers, 254 conditional expressions, 25 conditional frequency distributions, 44, 52–56 combining with regular expressions, 103 condition and event pairs, 52 counting words by genre, 52 generating random text with bigrams, 55 male and female names ending in each alphabet letter, 62 plotting and tabulating distributions, 53 using to find minimally contrasting set of words, 64 ConditionalFreqDist, 52 commonly used methods, 56 conditionals, 22, 133 confusion matrix, 207, 240 consecutive classification, 232 non phrase chunking with consecutive classifier, 275 consistent, 366 466 | General Index constituent structure, 296 constituents, 297 context exploiting in part-of-speech classifier, 230 for taggers, 203 context-free grammar, 298, 300 (see also grammars) probabilistic context-free grammar, 320 contractions in tokenization, 112 control, 22 control structures, 26 conversion specifiers, 118 conversions of data formats, 419 coordinate structures, 295 coreferential, 373 corpora, 39–52 annotated text corpora, 46–48 Brown Corpus, 42–44 creating and accessing, resources for further reading, 438 defined, 39 differences in corpus access methods, 50 exploring text corpora using a chunker, 267 Gutenberg Corpus, 39–42 Inaugural Address Corpus, 45 from languages other than English, 48 loading your own corpus, 51 obtaining from 
Web, 416 Reuters Corpus, 44 sources of, 73 tagged, 181–189 text corpus structure, 49–51 web and chat text, 42 wordlists, 60–63 corpora, included with NLTK, 46 corpus case study, structure of TIMIT, 407–412 corpus HOWTOs, 122 life cycle of, 412–416 creation scenarios, 412 curation versus evolution, 415 quality control, 413 widely-used format for, 421 counters, legitimate uses of, 141 cross-validation, 241 CSV (comma-separated value) format, 418 CSV (comma-separated-value) format, 170 D \d decimal digits in regular expressions, 110 \D nondigit characters in regular expressions, 111 data formats, converting, 419 data types dictionary, 190 documentation for Python standard types, 173 finding type of Python objects, 86 function parameter, 146 operations on objects, 86 database query via natural language, 361–365 databases, obtaining data from, 418 debugger (Python), 158 debugging techniques, 158 decimal integers, formatting, 119 decision nodes, 242 decision stumps, 243 decision trees, 242–245 entropy and information gain, 243 decision-tree classifier, 229 declarative style, 140 decoding, 94 def keyword, 9 defaultdict, 193 defensive programming, 159 demonstratives, agreement with noun, 329 dependencies, 310 criteria for, 312 existential dependencies, modeling in XML, 427 non-projective, 312 projective, 311 unbounded dependency constructions, 349–353 dependency grammars, 310–315 valency and the lexicon, 312 dependents, 310 descriptive models, 255 determiners, 186 agreement with nouns, 333 deve-test set, 225 development set, 225 similarity to test set, 238 dialogue act tagging, 214 dialogue acts, identifying types, 235 dialogue systems (see spoken dialogue systems) dictionaries feature set, 223 feature structures as, 337 pronouncing dictionary, 63–65 Python, 189–198 default, 193 defining, 193 dictionary data type, 190 finding key given a value, 197 indexing lists versus, 189 summary of dictionary methods, 197 updating incrementally, 195 storing features and values, 327 
translation, 66 dictionary methods, 197 dictionary data structure (Python), 65 directed acyclic graphs (DAGs), 338 discourse module, 401 discourse semantics, 397–402 discourse processing, 400–402 discourse referents, 397 discourse representation structure (DRS), 397 Discourse Representation Theory (DRT), 397–400 dispersion plot, 6 divide-and-conquer strategy, 160 docstrings, 143 contents and structure of, 148 example of complete docstring, 148 module-level, 155 doctest block, 148 doctest module, 160 document classification, 227 documentation functions, 148 online Python documentation, versions and, 173 Python, resources for further information, 173 docutils module, 148 domain (of a model), 377 DRS (discourse representation structure), 397 DRS conditions, 397 DRT (Discourse Representation Theory), 397– 400 Dublin Core Metadata initiative, 435 duck typing, 281 dynamic programming, 165 General Index | 467 application to parsing with context-free grammar, 307 different approaches to, 167 E Earley chart parser, 334 electronic books, 80 elements, XML, 425 ElementTree interface, 427–429 using to access Toolbox data, 429 elif clause, if . . . 
elif statement, 133 elif statements, 26 else statements, 26 encoding, 94 encoding features, 223 encoding parameters, codecs module, 95 endangered languages, special considerations with, 423–424 entities, 373 entity detection, using chunking, 264–270 entries adding field to, in Toolbox, 431 contents of, 60 converting data formats, 419 formatting in XML, 430 entropy, 251 (see also Maximum Entropy classifiers) calculating for gender prediction task, 243 maximizing in Maximum Entropy classifier, 252 epytext markup language, 148 equality, 132, 372 equivalence (<->) operator, 368 equivalent, 340 error analysis, 225 errors runtime, 13 sources of, 156 syntax, 3 evaluation sets, 238 events, pairing with conditions in conditional frequency distribution, 52 exceptions, 158 existential quantifier, 374 exists operator, 376 Expected Likelihood Estimation, 249 exporting data, 117 468 | General Index F f-structure, 357 feature extractors defining for dialogue acts, 235 defining for document classification, 228 defining for noun phrase (NP) chunker, 276–278 defining for punctuation, 234 defining for suffix checking, 229 Recognizing Textual Entailment (RTE), 236 selecting relevant features, 224–227 feature paths, 339 feature sets, 223 feature structures, 328 order of features, 337 resources for further reading, 357 feature-based grammars, 327–360 auxiliary verbs and inversion, 348 case and gender in German, 353 example grammar, 333 extending, 344–356 lexical heads, 347 parsing using Earley chart parser, 334 processing feature structures, 337–344 subsumption and unification, 341–344 resources for further reading, 357 subcategorization, 344–347 syntactic agreement, 329–331 terminology, 336 translating from English to SQL, 362 unbounded dependency constructions, 349–353 using attributes and constraints, 331–336 features, 223 non-binary features in naive Bayes classifier, 249 fields, 136 file formats, libraries for, 172 files opening and reading local files, 84 writing program output 
to, 120 fillers, 349 first-order logic, 372–385 individual variables and assignments, 378 model building, 383 quantifier scope ambiguity, 381 summary of language, 376 syntax, 372–375 theorem proving, 375 truth in model, 377 floating-point numbers, formatting, 119 folds, 241 for statements, 26 combining with if statements, 26 inside a list comprehension, 63 iterating over characters in strings, 90 format strings, 118 formatting program output, 116–121 converting from lists to strings, 116 strings and formats, 117–118 text wrapping, 120 writing results to file, 120 formulas of propositional logic, 368 formulas, type (t), 373 free, 375 Frege’s Principle, 385 frequency distributions, 17, 22 conditional (see conditional frequency distributions) functions defined for, 22 letters, occurrence in strings, 90 functions, 142–154 abstraction provided by, 147 accumulative, 150 as arguments to another function, 149 call-by-value parameter passing, 144 checking parameter types, 146 defined, 9, 57 documentation for Python built-in functions, 173 documenting, 148 errors from, 157 for frequency distributions, 22 for iteration over sequences, 134 generating plurals of nouns (example), 58 higher-order, 151 inputs and outputs, 143 named arguments, 152 naming, 142 poorly-designed, 147 recursive, call structure, 165 saving in modules, 59 variable scope, 145 well-designed, 147 gazetteer, 282 gender identification, 222 Decision Tree model for, 242 gender in German, 353–356 Generalized Phrase Structure Grammar (GPSG), 345 generate_model ( ) function, 55 generation of language output, 29 generative classifiers, 254 generator expressions, 138 functions exemplifying, 151 genres, systematic differences between, 42–44 German, case and gender in, 353–356 gerunds, 211 glyphs, 94 gold standard, 201 government-sponsored challenges to machine learning application in NLP, 257 gradient (grammaticality), 318 grammars, 327 (see also feature-based grammars) chunk grammar, 265 context-free, 298–302 parsing 
with, 302–310 validating Toolbox entries with, 433 writing your own, 300 dependency, 310–315 development, 315–321 problems with ambiguity, 317 treebanks and grammars, 315–317 weighted grammar, 318–321 dilemmas in sentence structure analysis, 292–295 resources for further reading, 322 scaling up, 315 grammatical category, 328 graphical displays of data conditional frequency distributions, 56 Matplotlib, 168–170 graphs defining and manipulating, 170 directed acyclic graphs, 338 greedy sequence classification, 232 Gutenberg Corpus, 40–42, 80 G hapaxes, 19 hash arrays, 189, 190 (see also dictionaries) gaps, 349 H General Index | 469 head of a sentence, 310 criteria for head and dependencies, 312 heads, lexical, 347 headword (lemma), 60 Heldout Estimation, 249 hexadecimal notation for Unicode string literal, 95 Hidden Markov Models, 233 higher-order functions, 151 holonyms, 70 homonyms, 60 HTML documents, 82 HTML markup, stripping out, 418 hypernyms, 70 searching corpora for, 106 semantic similarity and, 72 hyphens in tokenization, 110 hyponyms, 69 I identifiers for variables, 15 idioms, Python, 24 IDLE (Interactive DeveLopment Environment), 2 if . . . 
elif statements, 133 if statements, 25 combining with for statements, 26 conditions in, 133 immediate constituents, 297 immutable, 93 implication (->) operator, 368 in operator, 91 Inaugural Address Corpus, 45 inconsistent, 366 indenting code, 138 independence assumption, 248 naivete of, 249 indexes counting from zero (0), 12 list, 12–14 mapping dictionary definition to lexeme, 419 speeding up program by using, 163 string, 15, 89, 91 text index created using a stemmer, 107 words containing a given consonant-vowel pair, 103 inference, 369 information extraction, 261–289 470 | General Index architecture of system, 263 chunking, 264–270 defined, 262 developing and evaluating chunkers, 270– 278 named entity recognition, 281–284 recursion in linguistic structure, 278–281 relation extraction, 284 resources for further reading, 286 information gain, 243 inside, outside, begin tags (see IOB tags) integer ordinal, finding for character, 95 interpreter >>> prompt, 2 accessing, 2 using text editor instead of to write programs, 56 inverted clauses, 348 IOB tags, 269, 286 reading, 270–272 is operator, 145 testing for object identity, 132 ISO 639 language codes, 65 iterative optimization techniques, 251 J joint classifier models, 231 joint-features (maximum entropy model), 252 K Kappa coefficient (k), 414 keys, 65, 191 complex, 196 keyword arguments, 153 Kleene closures, 100 L lambda expressions, 150, 386–390 example, 152 lambda operator (λ), 386 Lancaster stemmer, 107 language codes, 65 language output, generating, 29 language processing, symbol processing versus, 442 language resources describing using OLAC metadata, 435–437 LanguageLog (linguistics blog), 35 latent semantic analysis, 171 Latin-2 character encoding, 94 leaf nodes, 242 left-corner parser, 306 left-recursive, 302 lemmas, 60 lexical relationships between, 71 pairing of synset with a word, 68 lemmatization, 107 example of, 108 length of a text, 7 letter trie, 162 lexical categories, 179 lexical entry, 60 lexical 
relations, 70 lexical resources comparative wordlists, 65 pronouncing dictionary, 63–65 Shoebox and Toolbox lexicons, 66 wordlist corpora, 60–63 lexicon, 60 (see also lexical resources) chunking Toolbox lexicon, 434 defined, 60 validating in Toolbox, 432–435 LGB rule of name resolution, 145 licensed, 350 likelihood ratios, 224 Linear-Chain Conditional Random Field Models, 233 linguistic objects, mappings from keys to values, 190 linguistic patterns, modeling, 255 linguistics and NLP-related concepts, resources for, 34 list comprehensions, 24 for statement in, 63 function invoked in, 64 used as function parameters, 55 lists, 10 appending item to, 11 concatenating, using + operator, 11 converting to strings, 116 indexing, 12–14 indexing, dictionaries versus, 189 normalizing and sorting, 86 Python list type, 86 sorted, 14 strings versus, 92 tuples versus, 136 local variables, 58 logic first-order, 372–385 natural language, semantics, and, 365–368 propositional, 368–371 resources for further reading, 404 logical constants, 372 logical form, 368 logical proofs, 370 loops, 26 looping with conditions, 26 lowercase, converting text to, 45, 107 M machine learning application to NLP, web pages for government challenges, 257 decision trees, 242–245 Maximum Entropy classifiers, 251–254 naive Bayes classifiers, 246–250 packages, 237 resources for further reading, 257 supervised classification, 221–237 machine translation (MT) limitations of, 30 using NLTK’s babelizer, 30 mapping, 189 Matplotlib package, 168–170 maximal projection, 347 Maximum Entropy classifiers, 251–254 Maximum Entropy Markov Models, 233 Maximum Entropy principle, 253 memoization, 167 meronyms, 70 metadata, 435 OLAC (Open Language Archives Community), 435 modals, 186 model building, 383 model checking, 379 models interpretation of sentences of logical language, 371 of linguistic patterns, 255 representation using set theory, 367 truth-conditional semantics in first-order logic, 377 General Index | 471 what can 
be learned from models of language, 255 modifiers, 314 modules defined, 59 multimodule programs, 156 structure of Python module, 154 morphological analysis, 213 morphological cues to word category, 211 morphological tagging, 214 morphosyntactic information in tagsets, 212 MSWord, text from, 85 mutable, 93 N \n newline character in regular expressions, 111 n-gram tagging, 203–208 across sentence boundaries, 208 combining taggers, 205 n-gram tagger as generalization of unigram tagger, 203 performance limitations, 206 separating training and test data, 203 storing taggers, 206 unigram tagging, 203 unknown words, 206 naive Bayes assumption, 248 naive Bayes classifier, 246–250 developing for gender identification task, 223 double-counting problem, 250 as generative classifier, 254 naivete of independence assumption, 249 non-binary features, 249 underlying probabilistic model, 248 zero counts and smoothing, 248 name resolution, LGB rule for, 145 named arguments, 152 named entities commonly used types of, 281 relations between, 284 named entity recognition (NER), 281–284 Names Corpus, 61 negative lookahead assertion, 284 NER (see named entity recognition) nested code blocks, 25 NetworkX package, 170 new words in languages, 212 472 | General Index newlines, 84 matching in regular expressions, 109 printing with print statement, 90 resources for further information, 122 non-logical constants, 372 non-standard words, 108 normalizing text, 107–108 lemmatization, 108 using stemmers, 107 noun phrase (NP), 297 noun phrase (NP) chunking, 264 regular expression–based NP chunker, 267 using unigram tagger, 272 noun phrases, quantified, 390 nouns categorizing and tagging, 184 program to find most frequent noun tags, 187 syntactic agreement, 329 numerically intense algorithms in Python, increasing efficiency of, 257 NumPy package, 171 O object references, 130 copying, 132 objective function, 114 objects, finding data type for, 86 OLAC metadata, 74, 435 definition of metadata, 435 Open 
Language Archives Community, 435 Open Archives Initiative (OAI), 435 open class, 212 open formula, 374 Open Language Archives Community (OLAC), 435 operators, 369 (see also names of individual operators) addition and multiplication, 88 Boolean, 368 numerical comparison, 22 scope of, 157 word comparison, 23 or operator, 24 orthography, 328 out-of-vocabulary items, 206 overfitting, 225, 245 P packages, 59 parameters, 57 call-by-value parameter passing, 144 checking types of, 146 defined, 9 defining for functions, 143 parent nodes, 279 parsing, 318 (see also grammars) with context-free grammar left-corner parser, 306 recursive descent parsing, 303 shift-reduce parsing, 304 well-formed substring tables, 307–310 Earley chart parser, parsing feature-based grammars, 334 parsers, 302 projective dependency parser, 311 part-of-speech tagging (see POS tagging) partial information, 341 parts of speech, 179 PDF text, 85 Penn Treebank Corpus, 51, 315 personal pronouns, 186 philosophical divides in contemporary NLP, 444 phonetics computer-readable phonetic alphabet (SAMPA), 137 phones, 63 resources for further information, 74 phrasal level, 347 phrasal projections, 347 pipeline for NLP, 31 pixel images, 169 plotting functions, Matplotlib, 168 Porter stemmer, 107 POS (part-of-speech) tagging, 179, 208, 229 (see also tagging) differences in POS tagsets, 213 examining word context, 230 finding IOB chunk tag for word's POS tag, 272 in information retrieval, 263 morphology in POS tagsets, 212 resources for further reading, 214 simplified tagset, 183 storing POS tags in tagged corpora, 181 tagged data from four Indian languages, 182 unsimplified tags, 187 use in noun phrase chunking, 265 using consecutive classifier, 231 pre-sorting, 160 precision, evaluating search tasks for, 239 precision/recall trade-off in information retrieval, 205 predicates (first-order logic), 372 prepositional phrase (PP), 297 prepositional phrase attachment ambiguity, 300 Prepositional Phrase Attachment 
Corpus, 316 prepositions, 186 present participles, 211 Principle of Compositionality, 385, 443 print statements, 89 newline at end, 90 string formats and, 117 prior probability, 246 probabilistic context-free grammar (PCFG), 320 probabilistic model, naive Bayes classifier, 248 probabilistic parsing, 318 procedural style, 139 processing pipeline (NLP), 86 productions in grammars, 293 rules for writing CFGs for parsing in NLTK, 301 program development, 154–160 debugging techniques, 158 defensive programming, 159 multimodule programs, 156 Python module structure, 154 sources of error, 156 programming style, 139 programs, writing, 129–177 advanced features of functions, 149–154 algorithm design, 160–167 assignment, 130 conditionals, 133 equality, 132 functions, 142–149 resources for further reading, 173 sequences, 133–138 style considerations, 138–142 legitimate uses for counters, 141 procedural versus declarative style, 139 General Index | 473 Python coding style, 138 summary of important points, 172 using Python libraries, 167–172 Project Gutenberg, 80 projections, 347 projective, 311 pronouncing dictionary, 63–65 pronouns anaphoric antecedents, 397 interpreting in first-order logic, 373 resolving in discourse processing, 401 proof goal, 376 properties of linguistic categories, 331 propositional logic, 368–371 Boolean operators, 368 propositional symbols, 368 pruning decision nodes, 245 punctuation, classifier for, 233 Python carriage return and linefeed characters, 80 codecs module, 95 dictionary data structure, 65 dictionary methods, summary of, 197 documentation, 173 documentation and information resources, 34 ElementTree module, 427 errors in understanding semantics of, 157 finding type of any object, 86 getting started, 2 increasing efficiency of numerically intense algorithms, 257 libraries, 167–172 CSV, 170 Matplotlib, 168–170 NetworkX, 170 NumPy, 171 other, 172 reference materials, 122 style guide for Python code, 138 textwrap module, 120 Python Package 
Index, 172 Q quality control in corpus creation, 413 quantification first-order logic, 373, 380 quantified noun phrases, 390 scope ambiguity, 381, 394–397 474 | General Index quantified formulas, interpretation of, 380 questions, answering, 29 quotation marks in strings, 87 R random text generating in various styles, 6 generating using bigrams, 55 raster (pixel) images, 169 raw strings, 101 raw text, processing, 79–128 capturing user input, 85 detecting word patterns with regular expressions, 97–101 formatting from lists to strings, 116–121 HTML documents, 82 NLP pipeline, 86 normalizing text, 107–108 reading local files, 84 regular expressions for tokenizing text, 109– 112 resources for further reading, 122 RSS feeds, 83 search engine results, 82 segmentation, 112–116 strings, lowest level text processing, 87–93 summary of important points, 121 text from web and from disk, 80 text in binary formats, 85 useful applications of regular expressions, 102–106 using Unicode, 93–97 raw( ) function, 41 re module, 101, 110 recall, evaluating search tasks for, 240 Recognizing Textual Entailment (RTE), 32, 235 exploiting word context, 230 records, 136 recursion, 161 function to compute Sanskrit meter (example), 165 in linguistic structure, 278–281 tree traversal, 280 trees, 279–280 performance and, 163 in syntactic structure, 301 recursive, 301 recursive descent parsing, 303 reentrancy, 340 references (see object references) regression testing framework, 160 regular expressions, 97–106 character class and other symbols, 110 chunker based on, evaluating, 272 extracting word pieces, 102 finding word stems, 104 matching initial and final vowel sequences and all consonants, 102 metacharacters, 101 metacharacters, summary of, 101 noun phrase (NP) chunker based on, 265 ranges and closures, 99 resources for further information, 122 searching tokenized text, 105 symbols, 110 tagger, 199 tokenizing text, 109–112 use in PlaintextCorpusReader, 51 using basic metacharacters, 98 using for 
relation extraction, 284 using with conditional frequency distributions, 103 relation detection, 263 relation extraction, 284 relational operators, 22 reserved words, 15 return statements, 144 return value, 57 reusing code, 56–59 creating programs using a text editor, 56 functions, 57 modules, 59 Reuters Corpus, 44 root element (XML), 427 root hypernyms, 70 root node, 242 root synsets, 69 Rotokas language, 66 extracting all consonant-vowel sequences from words, 103 Toolbox file containing lexicon, 429 RSS feeds, 83 feedparser library, 172 RTE (Recognizing Textual Entailment), 32, 235 exploiting word context, 230 runtime errors, 13 S \s whitespace characters in regular expressions, 111 \S nonwhitespace characters in regular expressions, 111 SAMPA computer-readable phonetic alphabet, 137 Sanskrit meter, computing, 165 satisfies, 379 scope of quantifiers, 381 scope of variables, 145 searches binary search, 160 evaluating for precision and recall, 239 processing search engine results, 82 using POS tags, 187 segmentation, 112–116 in chunking and tokenization, 264 sentence, 112 word, 113–116 semantic cues to word category, 211 semantic interpretations, NLTK functions for, 393 semantic role labeling, 29 semantics natural language, logic and, 365–368 natural language, resources for information, 403 semantics of English sentences, 385–397 quantifier ambiguity, 394–397 transitive verbs, 391–394 λ-calculus, 386–390 SemCor tagging, 214 sentence boundaries, tagging across, 208 sentence segmentation, 112, 233 in chunking, 264 in information retrieval process, 263 sentence structure, analyzing, 291–326 context-free grammar, 298–302 dependencies and dependency grammar, 310–315 grammar development, 315–321 grammatical dilemmas, 292 parsing with context-free grammar, 302–310 resources for further reading, 322 summary of important points, 321 syntax, 295–298 sents( ) function, 41 sequence classification, 231–233 other methods, 233 POS tagging with consecutive 
classifier, 232 sequence iteration, 134 sequences, 133–138 combining different sequence types, 136 converting between sequence types, 135 operations on sequence types, 134 processing using generator expressions, 137 strings and lists as, 92 shift operation, 305 shift-reduce parsing, 304 Shoebox, 66, 412 sibling nodes, 279 signature, 373 similarity, semantic, 71 Sinica Treebank Corpus, 316 slash categories, 350 slicing lists, 12, 13 strings, 15, 90 smoothing, 249 space-time trade-offs in algorithm design, 163 spaces, matching in regular expressions, 109 Speech Synthesis Markup Language (W3C SSML), 214 spellcheckers, Words Corpus used by, 60 spoken dialogue systems, 31 spreadsheets, obtaining data from, 418 SQL (Structured Query Language), 362 translating English sentence to, 362 stack trace, 158 standards for linguistic data creation, 421 standoff annotation, 415, 421 start symbol for grammars, 298, 334 startswith( ) function, 45 stemming, 107 NLTK HOWTO, 122 stemmers, 107 using regular expressions, 104 using stem( ) function, 105 stopwords, 60 stress (in pronunciation), 64 string formatting expressions, 117 string literals, Unicode string literal in Python, 95 strings, 15, 87–93 accessing individual characters, 89 accessing substrings, 90 basic operations with, 87–89 converting lists to, 116 formats, 117–118 formatting lining things up, 118 tabulating data, 119 immutability of, 93 lists versus, 92 methods, 92 more operations on, useful string methods, 92 printing, 89 Python’s str data type, 86 regular expressions as, 101 tokenizing, 86 structurally ambiguous sentences, 300 structure sharing, 340 interaction with unification, 343 structured data, 261 style guide for Python code, 138 stylistics, 43 subcategories of verbs, 314 subcategorization, 344–347 substrings (WFST), 307 substrings, accessing, 90 subsumes, 341 subsumption, 341–344 suffixes, classifier for, 229 supervised classification, 222–237 choosing features, 224–227 documents, 227 
exploiting context, 230 gender identification, 222 identifying dialogue act types, 235 part-of-speech tagging, 229 Recognizing Textual Entailment (RTE), 235 scaling up to large datasets, 237 sentence segmentation, 233 sequence classification, 231–233 Swadesh wordlists, 65 symbol processing, language processing versus, 442 synonyms, 67 synsets, 67 semantic similarity, 71 in WordNet concept hierarchy, 69 syntactic agreement, 329–331 syntactic cues to word category, 211 syntactic structure, recursion in, 301 syntax, 295–298 syntax errors, 3 T \t tab character in regular expressions, 111 T9 system, entering text on mobile phones, 99 tabs avoiding in code indentation, 138 matching in regular expressions, 109 tag patterns, 266 matching, precedence in, 267 tagging, 179–219 adjectives and adverbs, 186 combining taggers, 205 default tagger, 198 evaluating tagger performance, 201 exploring tagged corpora, 187–189 lookup tagger, 200–201 mapping words to tags using Python dictionaries, 189–198 nouns, 184 part-of-speech (POS) tagging, 229 performance limitations, 206 reading tagged corpora, 181 regular expression tagger, 199 representing tagged tokens, 181 resources for further reading, 214 across sentence boundaries, 208 separating training and testing data, 203 simplified part-of-speech tagset, 183 storing taggers, 206 transformation-based, 208–210 unigram tagging, 202 unknown words, 206 unsimplified POS tags, 187 using POS (part-of-speech) tagger, 179 verbs, 185 tags in feature structures, 340 IOB tags representing chunk structures, 269 XML, 425 tagsets, 179 morphosyntactic information in POS tagsets, 212 simplified POS tagset, 183 terms (first-order logic), 372 test sets, 44, 223 choosing for classification models, 238 testing classifier for document classification, 228 text, 1 computing statistics from, 16–22 counting vocabulary, 7–10 entering on mobile phones (T9 system), 99 as lists of words, 10–16 searching, 4–7 examining common contexts, 5 text alignment, 30 text 
editor, creating programs with, 56 textonyms, 99 textual entailment, 32 textwrap module, 120 theorem proving in first order logic, 375 timeit module, 164 TIMIT Corpus, 407–412 tokenization, 80 chunking and, 264 in information retrieval, 263 issues with, 111 list produced from tokenizing string, 86 regular expressions for, 109–112 representing tagged tokens, 181 segmentation and, 112 with Unicode strings as input and output, 97 tokenized text, searching, 105 tokens, 8 Toolbox, 66, 412, 431–435 accessing data from XML, using ElementTree, 429 adding field to each entry, 431 resources for further reading, 438 validating lexicon, 432–435 tools for creation, publication, and use of linguistic data, 421 top-down approach to dynamic programming, 167 top-down parsing, 304 total likelihood, 251 training classifier, 223 classifier for document classification, 228 classifier-based chunkers, 274–278 taggers, 203 unigram chunker using CoNLL 2000 Chunking Corpus, 273 training sets, 223, 225 transformation-based tagging, 208–210 transitive verbs, 314, 391–394 translations comparative wordlists, 66 machine (see machine translation) treebanks, 315–317 trees, 279–281 representing chunks, 270 traversal of, 280 trie, 162 trigram taggers, 204 truth conditions, 368 truth-conditional semantics in first-order logic, 377 tuples, 133 lists versus, 136 parentheses with, 134 representing tagged tokens, 181 Turing Test, 31, 368 type-raising, 390 type-token distinction, 8 TypeError, 157 types, 8, 86 (see also data types) types (first-order logic), 373 U unary predicate, 372 unbounded dependency constructions, 349–353 defined, 350 underspecified, 333 Unicode, 93–97 decoding and encoding, 94 definition and description of, 94 extracting from files, 94 resources for further information, 122 using your local encoding in Python, 97 unicodedata module, 96 unification, 342–344 unigram taggers confusion matrix for, 240 noun phrase chunking with, 272 unigram tagging, 202 lookup 
tagger (example), 200 separating training and test data, 203 478 | General Index unique beginners, 69 Universal Feed Parser, 83 universal quantifier, 374 unknown words, tagging, 206 updating dictionary incrementally, 195 US Presidential Inaugural Addresses Corpus, 45 user input, capturing, 85 V valencies, 313 validity of arguments, 369 validity of XML documents, 426 valuation, 377 examining quantifier scope ambiguity, 381 Mace4 model converted to, 384 valuation function, 377 values, 191 complex, 196 variables arguments of predicates in first-order logic, 373 assignment, 378 bound by quantifiers in first-order logic, 373 defining, 14 local, 58 naming, 15 relabeling bound variables, 389 satisfaction of, using to interpret quantified formulas, 380 scope of, 145 verb phrase (VP), 297 verbs agreement paradigm for English regular verbs, 329 auxiliary, 336 auxiliary verbs and inversion of subject and verb, 348 categorizing and tagging, 185 examining for dependency grammar, 312 head of sentence and dependencies, 310 present participle, 211 transitive, 391–394 W \W non-word characters in Python, 110, 111 \w word characters in Python, 110, 111 web text, 42 Web, obtaining data from, 416 websites, obtaining corpora from, 416 weighted grammars, 318–321 probabilistic context-free grammar (PCFG), 320 well-formed (XML), 425 well-formed formulas, 368 well-formed substring tables (WFST), 307– 310 whitespace regular expression characters for, 109 tokenizing text on, 109 wildcard symbol (.), 98 windowdiff scorer, 414 word classes, 179 word comparison operators, 23 word occurrence, counting in text, 8 word offset, 45 word processor files, obtaining data from, 417 word segmentation, 113–116 word sense disambiguation, 28 word sequences, 7 wordlist corpora, 60–63 WordNet, 67–73 concept hierarchy, 69 lemmatizer, 108 more lexical relations, 70 semantic similarity, 71 visualization of hypernym hierarchy using Matplotlib and NetworkX, 170 Words Corpus, 60 words( ) function, 40 wrapping text, 
120 Z zero counts (naive Bayes classifier), 249 zero projection, 347 X XML, 425–431 ElementTree interface, 427–429 formatting entries, 430 representation of lexical entry from chunk parsing Toolbox record, 434 resources for further reading, 438 role of, in using to represent linguistic structures, 426 using ElementTree to access Toolbox data, 429 using for linguistic structures, 425 validity of documents, 426

About the Authors

Steven Bird is Associate Professor in the Department of Computer Science and Software Engineering at the University of Melbourne, and Senior Research Associate in the Linguistic Data Consortium at the University of Pennsylvania.


pages: 821 words: 178,631

The Rust Programming Language by Steve Klabnik, Carol Nichols

anti-pattern, billion-dollar mistake, bioinformatics, business logic, business process, cryptocurrency, data science, DevOps, duck typing, Firefox, functional programming, Internet of things, iterative process, pull request, reproducible builds, Ruby on Rails, type inference

[ String::from("Yes"), String::from("Maybe"), String::from("No") ], }), Box::new(Button { width: 50, height: 10, label: String::from("OK"), }), ], }; screen.run(); }

Listing 17-9: Using trait objects to store values of different types that implement the same trait

When we wrote the library, we didn’t know that someone might add the SelectBox type, but our Screen implementation was able to operate on the new type and draw it because SelectBox implements the Draw trait, which means it implements the draw method. This concept, of being concerned only with the messages a value responds to rather than the value’s concrete type, is similar to the concept of duck typing in dynamically typed languages: if it walks like a duck and quacks like a duck, then it must be a duck! In the implementation of run on Screen in Listing 17-5, run doesn’t need to know what the concrete type of each component is. It doesn’t check whether a component is an instance of a Button or a SelectBox; it just calls the draw method on the component.

By specifying Box<dyn Draw> as the type of the values in the components vector, we’ve defined Screen to need values that we can call the draw method on. The advantage of using trait objects and Rust’s type system to write code similar to code using duck typing is that we never have to check whether a value implements a particular method at runtime or worry about getting errors if a value doesn’t implement a method but we call it anyway. Rust won’t compile our code if the values don’t implement the traits that the trait objects need. For example, Listing 17-10 shows what happens if we try to create a Screen with a String as a component.
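For contrast, here is a minimal Python sketch of the dynamically typed situation the passage compares trait objects to. The class names mirror the book's Screen, Button, and SelectBox, but the string-returning draw() bodies are assumptions made purely for illustration. The point is that a component without a draw method is accepted when the Screen is built and only fails when run() calls it, which is the runtime error Rust's trait bound rules out at compile time.

```python
class Button:
    def __init__(self, label):
        self.label = label

    def draw(self):
        # Illustrative rendering only; the book's draw() bodies are not shown.
        return f"[{self.label}]"


class SelectBox:
    def __init__(self, options):
        self.options = options

    def draw(self):
        return "<" + "|".join(self.options) + ">"


class Screen:
    def __init__(self, components):
        # Nothing is checked here: any object is accepted as a component.
        self.components = components

    def run(self):
        # No isinstance checks; anything with a draw() method works.
        return [component.draw() for component in self.components]


rendered = Screen([SelectBox(["Yes", "Maybe", "No"]), Button("OK")]).run()

# A str has no draw() method, but Python only notices when run() calls it.
try:
    Screen(["Hi"]).run()
    failure = None
except AttributeError as exc:
    failure = type(exc).__name__
```

In this sketch the mistake surfaces as an AttributeError during run(), whereas the Rust version with a String component is rejected before the program can run at all.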


pages: 1,331 words: 183,137

Programming Rust: Fast, Safe Systems Development by Jim Blandy, Jason Orendorff

bioinformatics, bitcoin, Donald Knuth, duck typing, Elon Musk, Firefox, fizzbuzz, functional programming, mandelbrot fractal, Morris worm, MVC pattern, natural language processing, reproducible builds, side project, sorting algorithm, speech recognition, Turing test, type inference, WebSocket

Functions can be generic: when a function’s purpose and implementation are general enough, you can define it to work on any set of types that meet the necessary criteria. A single definition can cover an open-ended set of use cases. In Python and JavaScript, all functions work this way naturally: a function can operate on any value that has the properties and methods the function will need. (This is the characteristic often called duck typing: if it quacks like a duck, it’s a duck.) But it’s exactly this flexibility that makes it so difficult for those languages to detect type errors early; testing is often the only way to catch such mistakes. Rust’s generic functions give the language a degree of the same flexibility, while still catching all type errors at compile time.

Had we known, we could have added num to our Cargo.toml and written:

    use num::Num;

    fn dot<N: Num + Copy>(v1: &[N], v2: &[N]) -> N {
        let mut total = N::zero();
        for i in 0 .. v1.len() {
            total = total + v1[i] * v2[i];
        }
        total
    }

Just as in object-oriented programming, the right interface makes everything nice, in generic programming, the right trait makes everything nice. Still, why go to all this trouble? Why didn’t Rust’s designers make the generics more like C++ templates, where the constraints are left implicit in the code, à la “duck typing?” One advantage of Rust’s approach is forward compatibility of generic code. You can change the implementation of a public generic function or method, and if you didn’t change the signature, you haven’t broken any of its users. Another advantage of bounds is that when you do get a compiler error, at least the compiler can tell you where the trouble is.
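To make the "implicit constraints" alternative concrete, the following is a rough Python counterpart of dot, written as a sketch rather than a translation: it pairs elements with zip instead of indexing, so a length mismatch is silently truncated here, unlike the Rust loop. Nothing in the signature says that the elements must support + and *; that requirement is discovered only when the body runs.

```python
from fractions import Fraction


def dot(v1, v2):
    # The only "bound" on the element type is implicit: the values must
    # support * with each other and + with the running total.
    total = 0
    for a, b in zip(v1, v2):
        total = total + a * b
    return total


ints = dot([1, 2, 3], [4, 5, 6])
fracs = dot([Fraction(1, 2), Fraction(1, 3)], [Fraction(2), Fraction(3)])

# Unsuitable element types are reported only at run time, from inside the body.
try:
    dot([1, 2], ["a", None])
    mistake = None
except TypeError as exc:
    mistake = type(exc).__name__
```

The error in the last call is raised from the arithmetic inside dot, not at the call site, which illustrates the passage's point that explicit bounds let the compiler say where the trouble is.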


pages: 536 words: 73,482

Programming Clojure by Stuart Halloway, Aaron Bedra

continuous integration, duck typing, en.wikipedia.org, functional programming, general-purpose programming language, Gödel, Escher, Bach, higher-order functions, Neal Stephenson, Paul Graham, Ruby on Rails, type inference, web application

Using Primitives for Performance

In the preceding sections, function parameters carry no type information. Clojure simply does the right thing. Depending on your perspective, this is either a strength or a weakness. It’s a strength, because your code is clean and simple and can take advantage of duck typing. But it’s also a weakness, because a reader of the code cannot be certain of datatypes and because doing the right thing carries some performance overhead. Consider a function that calculates the sum of the numbers from 1 to n:

    ; performance demo only, don't write code like this
    (defn sum-to [n]
      (loop [i 1 sum 0]
        (if (<= i n)
          (recur (inc i) (+ i sum))
          sum)))

You can verify that this function works with a small input value:

    (sum-to 10)
    => 55

Let’s see how sum-to performs.


Hands-On Machine Learning With Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurelien Geron

AlphaGo, Amazon Mechanical Turk, Bayesian statistics, centre right, combinatorial explosion, constrained optimization, correlation coefficient, crowdsourcing, data science, deep learning, DeepMind, duck typing, en.wikipedia.org, Geoffrey Hinton, iterative process, Netflix Prize, NP-complete, optical character recognition, P = NP, p-value, pattern recognition, performance metric, recommendation engine, self-driving car, SpamAssassin, speech recognition, statistical model

Custom Transformers

Although Scikit-Learn provides many useful transformers, you will need to write your own for tasks such as custom cleanup operations or combining specific attributes. You will want your transformer to work seamlessly with Scikit-Learn functionality (such as pipelines), and since Scikit-Learn relies on duck typing (not inheritance), all you need is to create a class and implement three methods: fit() (returning self), transform(), and fit_transform(). You can get the last one for free by simply adding TransformerMixin as a base class. Also, if you add BaseEstimator as a base class (and avoid *args and **kwargs in your constructor) you will get two extra methods (get_params() and set_params()) that will be useful for automatic hyperparameter tuning.
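The three-method protocol described above can be sketched without importing Scikit-Learn at all. In the example below, MeanCenterer and run_pipeline are hypothetical names invented for illustration; run_pipeline is a toy stand-in for a pipeline, not Scikit-Learn's Pipeline class, and plain nested lists stand in for arrays. It only shows that a class with fit(), transform(), and fit_transform() can be driven by code that never checks its base class.

```python
class MeanCenterer:
    """Hypothetical duck-typed transformer: subtracts per-column means."""

    def fit(self, X, y=None):
        n_rows = len(X)
        # Learn one mean per column; zip(*X) iterates over columns.
        self.means_ = [sum(column) / n_rows for column in zip(*X)]
        return self  # returning self allows chaining, as the text requires

    def transform(self, X):
        return [[value - mean for value, mean in zip(row, self.means_)]
                for row in X]

    def fit_transform(self, X, y=None):
        # Written by hand here; per the text, TransformerMixin would supply it.
        return self.fit(X, y).transform(X)


def run_pipeline(steps, X):
    # Toy stand-in for a pipeline: relies only on each step having
    # fit_transform(), never on its class or base classes.
    for step in steps:
        X = step.fit_transform(X)
    return X


centered = run_pipeline([MeanCenterer()], [[1.0, 10.0], [3.0, 30.0]])
```

For real use with Scikit-Learn's own tooling, inheriting from BaseEstimator and TransformerMixin as the passage suggests is the safer route, since it also provides get_params() and set_params().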


pages: 1,065 words: 229,099

Real World Haskell by Bryan O'Sullivan, John Goerzen, Donald Stewart, Donald Bruce Stewart

bash_history, database schema, Debian, distributed revision control, domain-specific language, duck typing, en.wikipedia.org, Firefox, functional programming, general-purpose programming language, Guido van Rossum, higher-order functions, job automation, Larry Wall, lateral thinking, level 1 cache, machine readable, p-value, panic early, plutocrats, revision control, sorting algorithm, SQL injection, transfer pricing, type inference, web application, Yochai Benkler

The (&&) operator requires each of its operands to be of type Bool, and its left operand indeed has this type. Since the actual type of "false" does not match the required type, the compiler rejects this expression as ill typed. Static typing can occasionally make it difficult to write some useful kinds of code. In languages such as Python, duck typing is common, where an object acts enough like another to be used as a substitute for it.[2] Fortunately, Haskell’s system of typeclasses, which we will cover in Chapter 6, provides almost all of the benefits of dynamic typing, in a safe and convenient form. Haskell has some support for programming with truly dynamic types, though it is not quite as easy as it is in a language that wholeheartedly embraces the notion.
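As a small illustration of the Python behavior mentioned above, the sketch below shows one object standing in for another because it "acts enough like" it. The function name count_nonblank_lines is made up for this example; its only real requirement is that iterating over its argument yields strings, so an in-memory text stream and a plain list are both acceptable substitutes for an open file.

```python
import io


def count_nonblank_lines(source):
    # Only requirement: iterating over `source` yields strings.
    # An open file, an io.StringIO, or a list of str all qualify.
    return sum(1 for line in source if line.strip())


from_stream = count_nonblank_lines(io.StringIO("a\n\nb\n"))
from_list = count_nonblank_lines(["x", "  ", "y", "z"])
```

No declaration anywhere states that these types are interchangeable; the substitution works simply because both support the one operation the function uses.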

combinator functions, Gluing Predicates Together combining functions, Getting started with the API command-line, Command-Line Editing in ghci, A Simple Command-Line Framework, Reading Command-Line Arguments arguments, reading, Reading Command-Line Arguments editing, Command-Line Editing in ghci commas (,), Lists, Useful Composite Data Types: Lists and Tuples, First Steps with Parsec: Simple CSV Parsing CSV files and, First Steps with Parsec: Simple CSV Parsing tuples, writing, Useful Composite Data Types: Lists and Tuples compact function, Compact Rendering comparison operators, Boolean Logic, Operators, and Value Comparisons, Equality, Ordering, and Comparisons compilers, Your Haskell Environment, Compiling Haskell Source Glasgow Haskell, Your Haskell Environment components (types), Defining a New Data Type composable functors, Thinking More About Functors composite data types, Useful Composite Data Types: Lists and Tuples concat function, More Simple List Manipulations, Left Folds, Laziness, and Space Leaks, The List Monad concurrent programs, Concurrent and Multicore Programming–Conclusions, Hiding Latency, The Main Thread and Waiting for Other Threads–Communicating over Channels, Shared-State Concurrency Is Still Hard–Using Multiple Cores with GHC, A Concurrent Web Link Checker latency, hiding, Hiding Latency main thread waiting for other threads, The Main Thread and Waiting for Other Threads–Communicating over Channels shared-state, Shared-State Concurrency Is Still Hard–Using Multiple Cores with GHC conditional evaluation, Conditional Evaluation–Understanding Evaluation by Example, Conditional Evaluation with Guards constant applicative forms (CAFs), Time Profiling constants, binding C to Haskell, Binding to Constants constraints, Constraints on Type Definitions Are Bad, Constraints on Our Decoder decoding, Constraints on Our Decoder type definitions and, Constraints on Type Definitions Are Bad constructors, Construction and Deconstruction Content-Length 
field, Parsing Headers continuations, Parsing Headers control-with-character escapes, Control-with-Character Escapes Control.Applicative module, Infix Use of fmap Control.Arrow module, Another Round of Golf Control.Concurrent module, Initializing the GUI, Concurrent Programming with Threads concurrent programming with threads, Concurrent Programming with Threads Control.Exception module, The Acquire-Use-Release Cycle, First Steps with Exceptions, Selective Handling of Exceptions Control.Monad module, Another Way of Looking at Monads, Generalized Lifting, Failing Safely with MonadPlus, Writing Tighter Code lifting, Generalized Lifting MonadPlus typeclass and, Failing Safely with MonadPlus Control.Monad.Error module, Usage of the Maybe monad, Monadic use of Either, Error Handling in Monads Control.Monad.Trans module, Designing for Unexpected Uses Control.Parallel module, Transforming Our Code into Parallel Code Control.Parallel.Strategies module, Separating Algorithm from Evaluation Coordinated Universal Time (UTC), ClockTime and CalendarTime cores, Using Multiple Cores with GHC, Understanding Core–Advanced Techniques: Fusion using multiple, Using Multiple Cores with GHC cos function, Numeric Types countEntries function, Stacking Multiple Monad Transformers CSV files, First Steps with Parsec: Simple CSV Parsing–The sepBy and endBy Combinators, The sepBy and endBy Combinators Parsec helper functions and, The sepBy and endBy Combinators ctTZName function, Using CalendarTime ctWDay function, Using CalendarTime ctYDay function, Using CalendarTime currying, using partial functions, Partial Function Application and Currying custom data types for errors, Custom data types for errors c_sin function, Foreign Language Bindings: The Basics D dash (-), as a range character, Filename Matching data keyword, Defining a New Data Type, How to Give a Type a New Identity newtype keyword and, How to Give a Type a New Identity data structures, Defining a New Data Type, The structure,
Association Lists–General-Purpose Sequences, Functions Are Data, Too, Taking Advantage of Functions as Data–General-Purpose Sequences functions and, Functions Are Data, Too, Taking Advantage of Functions as Data–General-Purpose Sequences taking advantage of, Taking Advantage of Functions as Data–General-Purpose Sequences data type, defining, Defining a New Data Type–Type Synonyms (see also types) Data.Array module, Introducing Arrays, Folding over Arrays barcode recognition and, Introducing Arrays folding over arrays, Folding over Arrays Data.Bits module, Pretty Printing a String Data.ByteString.Char8 module, Text I/O, The Real Deal: Compiling and Matching Regular Expressions Data.ByteString.Lazy.Char8 module, Text I/O Data.Char module, Transforming Every Piece of Input Data.Dynamic module, Dynamic Exceptions Data.Foldable module, General-Purpose Sequences, Interference with Pure Code Data.Function module, Remembering a Match’s Parity Data.List module, As-patterns, Strictness and Tail Recursion tails function, As-patterns Data.List.lookup function, Association Lists Data.Map module, A Brief Introduction to Maps, Maps–Functions Are Data, Too Data.Monoid module, Lists, Difference Lists, and Monoids Data.Ratio module, Getting Started with ghci, the Interpreter Data.Sequence module, General-Purpose Sequences Data.Traversable module, Interference with Pure Code Data.Typeable module, Dynamic Exceptions database engines, Overview of HDBC Database.HDBC module, Initializing the GUI databases, Using Databases–Error Handling, Connecting to Databases, Simple Queries, Lazy Reading, Database Metadata connecting, Connecting to Databases lazy reading, Lazy Reading metadata, Database Metadata queries, Simple Queries dates, Dates and Times–Extended Example: Piping dates and times, Dates and Times–Extended Example: Piping Daylight Saving Time (DST), ClockTime and CalendarTime -ddump-asm compiler flag, Tuning the Generated Assembly -ddump-simpl compiler flag, Understanding Core, 
Profile-Driven Performance Tuning deadlocks, Safely Modifying an MVar, Deadlock Dean, Jeffrey, Finding the Most Popular URLs Debian Linux, installing GHC/Haskell libraries, Ubuntu and Debian Linux debugging, Boolean Logic, Operators, and Value Comparisons declarations (module), The Anatomy of a Haskell Module decoding barcodes, Encoding an EAN-13 Barcode deconstructors, Construction and Deconstruction delete function, Getting started with the API DeriveDataTypeable language, Dynamic Exceptions describeTable function, Database Metadata DiffArray type, Modifying Array Elements diffClockTimes function, TimeDiff for ClockTime directories, Directory and File Information disconnect function, Connecting to Databases discriminated unions, The discriminated union div function, Numeric Types do keyword, A Simple Command-Line Framework, Sequencing, Desugaring of do Blocks Monads and, Desugaring of do Blocks sequencing and, Sequencing Doc data type, Generating Test Data doskey command (ghci), Command-Line Editing in ghci double hashing, Turning Two Hashes into Many double quotes ("), writing strings, Strings and Characters, Writing Character and String Literals Double value, Some Common Basic Types, Numeric Types drivers (HDBC), installing, Installing HDBC and Drivers drop function, Functions over Lists and Tuples, Conditional Evaluation dropWhile function, Working with Sublists DST (Daylight Saving Time), ClockTime and CalendarTime duck typing, Static Types dynamic exceptions, Dynamic Exceptions–Error Handling in Monads E EAN-13 barcodes, A Little Bit About Barcodes easyList function, Testing with QuickCheck Either type, Motivation: Boilerplate Avoidance, Use of Either–Exceptions, Monadic use of Either monadic use of, Monadic use of Either elem function, Searching Lists elements function, Generating Test Data ELF object files, Binary I/O and Qualified Imports else keyword, Conditional Evaluation embedded domain specific languages, A Domain-Specific Language for 
Predicates–Controlling Traversal EmptyDataDecls language extension, Typed Pointers enclose function, Pretty Printing a String endBy function, The sepBy and endBy Combinators #enum construct, Automating the Binding enum keyword (C/C++), The enumeration Enum typeclass, Using CalendarTime enumeration notation, Lists enumeration types, The enumeration environment (programming), Your Haskell Environment environment variables, Environment Variables EOF (end of file), Working with Files and Handles eol function, Lookahead equality tests, The Need for Typeclasses, Equality, Ordering, and Comparisons error function, Handling Errors Through API Design errors, Boolean Logic, Operators, and Value Comparisons, Boolean Logic, Operators, and Value Comparisons, Strong Types, Algebraic Data Types, Reporting Errors, Type Inference Is a Double-Edged Sword, More Helpful Errors, Standard Input, Output, and Error, Handling Errors Through API Design, Reporting Parse Errors, Error Handling, Error Handling–Exceptions, Error Handling with Data Types–Exceptions, Custom data types for errors, Error Handling in Monads, Error Handling API design, handling, Handling Errors Through API Design compiling source code, Type Inference Is a Double-Edged Sword custom data types for, Custom data types for errors handling, Error Handling–Exceptions, Error Handling with Data Types–Exceptions, Error Handling in Monads, Error Handling data types, Error Handling with Data Types–Exceptions databases, Error Handling monads, Error Handling in Monads I/O and, Standard Input, Output, and Error messages, Boolean Logic, Operators, and Value Comparisons, Boolean Logic, Operators, and Value Comparisons, Algebraic Data Types Boolean values and, Boolean Logic, Operators, and Value Comparisons No instance, Boolean Logic, Operators, and Value Comparisons, Algebraic Data Types parsers, handling, Error Handling reporting, Reporting Errors typeclasses, Strong Types, More Helpful Errors ErrorT transformer, Error Handling in 
Monads escape characters, Strings and Characters escaping text, Escaping Text /etc/passwd file, Extended Example: /etc/passwd–Extended Example: Numeric Types evaluation, Understanding Evaluation by Example–Polymorphism in Haskell, Conditional Evaluation with Guards, Space Leaks and Strict Evaluation–Learning to Use seq conditional with guards, Conditional Evaluation with Guards strict, Space Leaks and Strict Evaluation–Learning to Use seq evaluation strategies, Separating Algorithm from Evaluation event-driven programming, Event-Driven Programming Exception type, First Steps with Exceptions exceptions, Error Handling, Exceptions–Error Handling in Monads, Selective Handling of Exceptions, I/O Exceptions, Throwing Exceptions, Dynamic Exceptions–Error Handling in Monads dynamic, Dynamic Exceptions–Error Handling in Monads I/O (input/output), I/O Exceptions selective handling of, Selective Handling of Exceptions throwing, Throwing Exceptions --exclude flag (hpc), Measuring Test Coverage with HPC executables, creating, Generating a Haskell Program and Importing Modules executeFile function, Using Pipes for Redirection exhaustive patterns, Exhaustive Patterns and Wild Cards explicit recursion, Explicit Recursion exponentiation (**) operator, Undefined Values, and Introducing Variables, Numeric Types exports, The Anatomy of a Haskell Module Exposed-Modules field, Writing a Package Description expressions, Passing an Expression to a Function, Introducing Local Variables functions, passing to, Passing an Expression to a Function let blocks and, Introducing Local Variables external programs, running, Running External Programs extract methods, The Monad Laws and Good Coding Style F fail function, The Monad Typeclass False Boolean value, Boolean Logic, Operators, and Value Comparisons FDs (file descriptors), Using Pipes for Redirection Fedora Linux, installing GHC/Haskell libraries, Fedora Linux fetchAllRowsAL’ function, Lazy Reading fetchAllRows’ function, Reading with 
Statements FFI (Haskell Foreign Function Interface), Interfacing with C: The FFI–The Real Deal: Compiling and Matching Regular Expressions FFI binding, Compilation Options and Interfacing to C fFlush function, Flushing The Buffer file descriptors (FDs), Using Pipes for Redirection file processing, Efficient File Processing–Putting Our Code to Work filename matching, Filename Matching files, Working with Files and Handles–Extended Example: Functional I/O and Temporary Files, Deleting and Renaming Files, Temporary Files, Efficient File Processing, Filename Matching, Sizing a File Safely–A Domain-Specific Language for Predicates, Directory and File Information, File Modification Times deleting/renaming, Deleting and Renaming Files filename matching, Filename Matching modification times, File Modification Times processing, Efficient File Processing (see file processing) sizing safely, Sizing a File Safely–A Domain-Specific Language for Predicates System.Directory module, using, Directory and File Information temporary, Temporary Files filesystems, I/O Case Study: A Library for Searching the Filesystem–Common Layout Styles searching, I/O Case Study: A Library for Searching the Filesystem–Common Layout Styles filter function, Searching Lists, Selecting Pieces of Input, Filters with interact interact, Filters with interact find command, I/O Case Study: A Library for Searching the Filesystem first function, Another Round of Golf flex, Using Parsec Float type, Numeric Types floating-point numbers, Simple Arithmetic, Lists enumerating, Lists fmap function, Infix Use of fmap, Monads and Functors, Moving Down the Stack monads and, Monads and Functors fold functions, Computing One Answer over a Collection, The Left Fold, Folding from the Right–Left Folds, Laziness, and Space Leaks folding from left, The Left Fold folding from right, Folding from the Right–Left Folds, Laziness, and Space Leaks foldl function, The Left Fold, Folding from the Right–Left Folds, Laziness, and Space 
Leaks, Left Folds, Laziness, and Space Leaks, Strictness and Tail Recursion foldr function and, Folding from the Right–Left Folds, Laziness, and Space Leaks laziness and space leaks, Left Folds, Laziness, and Space Leaks foldr function, Computing One Answer over a Collection, Folding from the Right–Left Folds, Laziness, and Space Leaks fold’ function, Strictness and Tail Recursion force function, Knowing What to Evaluate in Parallel foreign import declarations, Foreign Language Bindings: The Basics Foreign modules, Foreign Language Bindings: The Basics–Regular Expressions for Haskell: A Binding for PCRE Foreign.C.String module, Foreign Language Bindings: The Basics, Passing String Data Between Haskell and C Foreign.Marshal.Array module, Foreign Language Bindings: The Basics Foreign.Ptr module, Foreign Language Bindings: The Basics ForeignPtr type, Memory Management: Let the Garbage Collector Do the Work forkManaged function, Safe Resource Management: A Good Idea, and Easy Besides forkProcess function, Using Pipes for Redirection forM function, Why Provide Both mapM and forM?


Exploring ES6 - Upgrade to the next version of JavaScript by Axel Rauschmayer

anti-pattern, domain-specific language, duck typing, en.wikipedia.org, Firefox, functional programming, Google Chrome, MVC pattern, web application, WebSocket

[…] regular expressions, better string handling, new control statements, try/catch exception handling, tighter definition of errors, formatting for numeric output and other enhancements. [1]

• ECMAScript 4 was designed by Adobe, Mozilla, Opera, and Google and was a massive upgrade. Its planned feature sets included:
  – Programming in the large (classes, interfaces, namespaces, packages, program units, optional type annotations, and optional static type checking and verification)
  – Evolutionary programming and scripting (structural types, duck typing, type definitions, and multimethods)
  – Data structure construction (parameterized types, getters and setters, and meta-level methods)
  – Control abstractions (proper tail calls, iterators, and generators)
  – Introspection (type meta-objects and stack marks)
• ECMAScript 3.1 was designed by Microsoft and Yahoo.


pages: 554 words: 108,035

Scala in Depth by Tom Kleenex, Joshua Suereth

discrete time, domain-specific language, duck typing, fault tolerance, functional programming, higher-order functions, MVC pattern, sorting algorithm, type inference

It’s a great tool to understand and use when running into nested types that seem inexpressible, but it shouldn’t be needed in most situations. 6.6. Summary In this chapter, you learned the basic rules governing Scala’s type system. We learned how to define types and combine them. We looked at structural typing and how you can use it to emulate duck typing in dynamic languages. We learned how to create generic types using type parameters and how to enforce upper and lower bounds on types. We looked at higher-kinded types and type lambdas and how you can use them to simplify complex types. We also looked into variance and how to create flexible parameterized classes.
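As a rough analogue of the structural typing mentioned in this summary, here is a Python sketch rather than Scala code. In Scala a structural type such as `{ def speak(): String }` plays this role; the sketch uses `typing.Protocol` instead, and the names `Speaker`, `Duck`, `Robot` and `announce` are hypothetical.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Speaker(Protocol):
    """Structural type: any object with a speak() method conforms."""

    def speak(self) -> str: ...


class Duck:
    def speak(self) -> str:
        return "Quack"


class Robot:
    # No inheritance from Speaker or Duck; only the method shape matches.
    def speak(self) -> str:
        return "Beep"


def announce(speaker: Speaker) -> str:
    return f"It says: {speaker.speak()}"


messages = [announce(Duck()), announce(Robot())]
conforms = isinstance(Robot(), Speaker)  # structural check, not nominal
```

A static checker can verify calls to `announce` against the protocol, which is the "duck typing with compile-time checking" idea the summary attributes to structural types.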


pages: 444 words: 118,393

Release It! Design and Deploy Production-Ready Software by Michael T. Nygard

Amazon Web Services, anti-pattern, bitcoin, business cycle, business intelligence, business logic, business process, c2.com, call centre, cloud computing, continuous integration, Conway's law, creative destruction, dark matter, data science, database schema, deep learning, DevOps, disinformation, duck typing, en.wikipedia.org, fail fast, fault tolerance, Firefox, Hacker News, industrial robot, information security, Infrastructure as a Service, Internet of things, Jeff Bezos, Kanban, Kubernetes, load shedding, loose coupling, machine readable, Mars Rover, microservices, Minecraft, minimum viable product, MITM: man-in-the-middle, Morris worm, move fast and break things, OSI model, peer-to-peer lending, platform as a service, power law, ransomware, revision control, Ruby on Rails, Schrödinger's Cat, Silicon Valley, six sigma, software is eating the world, source of truth, SQL injection, systems thinking, text mining, time value of money, transaction costs, Turing machine, two-pizza team, web application, zero day

Either it’ll go into a different (possibly covert) database or it just won’t be represented anywhere. Instead of creating a single system of record for any given concept, we should think in terms of federated zones of authority. We allow different systems to own their own data, but we emphasize interchange via common formats and representations. Think of this like duck-typing for the enterprise. If you can exchange a URL for a representation that you can use like a customer, then as far as you care, it is a customer service, whether the data came from a database or a static file. Avoid Concept Leakage An electronics retailer was late to the digital music party.
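The "duck typing for the enterprise" idea above can be sketched in a few lines. Everything here is an assumption made for illustration: the two source classes, the `get(url)` method, the example URL and the `name` field are hypothetical and do not come from the book.

```python
import json


class DatabaseCustomers:
    """Hypothetical source backed by an in-memory stand-in for a table."""

    def __init__(self, rows):
        self._rows = rows

    def get(self, url):
        return dict(self._rows[url])


class StaticFileCustomers:
    """Hypothetical source backed by a JSON document, e.g. a static file."""

    def __init__(self, json_text):
        self._docs = json.loads(json_text)

    def get(self, url):
        return self._docs[url]


def greeting(source, url):
    # The caller only needs a representation it can use like a customer.
    customer = source.get(url)
    return f"Hello, {customer['name']}"


URL = "https://example.com/customers/1"  # placeholder URL
db = DatabaseCustomers({URL: {"name": "Ada"}})
static = StaticFileCustomers(json.dumps({URL: {"name": "Ada"}}))
from_db = greeting(db, URL)
from_file = greeting(static, URL)
```

The consumer treats both as "a customer service" because both exchange a URL for a usable representation; where the data lives is invisible to it.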


pages: 779 words: 116,439

Test-Driven Development With Python by Harry J. W. Percival

business logic, continuous integration, database schema, Debian, DevOps, don't repeat yourself, duck typing, Firefox, loose coupling, MVC pattern, off-by-one error, platform as a service, pull request, web application, WebSocket

    def test_list_name_is_first_item_text(self):
        list_ = List.objects.create()
        Item.objects.create(list=list_, text='first item')
        Item.objects.create(list=list_, text='second item')
        self.assertEqual(list_.name, 'first item')

lists/models.py (ch18l025):

    @property
    def name(self):
        return self.item_set.first().text

The @property Decorator in Python

If you haven’t seen it before, the @property decorator transforms a method on a class to make it appear to the outside world like an attribute. This is a powerful feature of the language, because it makes it easy to implement “duck typing”: you can change the implementation of a property without changing the interface of the class. In other words, if we decide to change .name into being a “real” attribute on the model, which is stored as text in the database, then we will be able to do so entirely transparently. As far as the rest of our code is concerned, it will still be able to just access .name and get the list name, without needing to know about the implementation.
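The point about swapping a computed property for a stored attribute can be seen without Django. The sketch below is a plain-Python stand-in, not the book's model code: `ComputedName`, `StoredName` and `heading` are hypothetical names.

```python
class ComputedName:
    """Derives the name on each access, as the excerpt's @property does."""

    def __init__(self, items):
        self.items = items

    @property
    def name(self):
        return self.items[0]


class StoredName:
    """Keeps the name as an ordinary attribute instead."""

    def __init__(self, items):
        self.items = items
        self.name = items[0]


def heading(list_):
    # Client code reads .name the same way in both cases.
    return f"List: {list_.name}"


items = ["first item", "second item"]
computed = heading(ComputedName(items))
stored = heading(StoredName(items))
```

`heading` cannot tell which implementation it was given, which is why the change of implementation is transparent to callers.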


Django Book by Matt Behrens

Benevolent Dictator For Life (BDFL), book value, business logic, create, read, update, delete, database schema, distributed revision control, don't repeat yourself, duck typing, en.wikipedia.org, Firefox, full text search, loose coupling, MITM: man-in-the-middle, MVC pattern, revision control, Ruby on Rails, school choice, slashdot, SQL injection, web application

From now on, anytime we need a view that lists a set of objects, we can simply reuse this object_list view rather than writing view code. Here are a couple of notes about what we did: We’re passing the model classes directly, as the model parameter. The dictionary of extra URLconf options can pass any type of Python object – not just strings. The model.objects.all() line is an example of duck typing: “If it walks like a duck and talks like a duck, we can treat it like a duck.” Note the code doesn’t know what type of object model is; the only requirement is that model have an objects attribute, which in turn has an all() method. We’re using model.__name__.lower() in determining the template name.
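The requirement described above (a `model` with an `objects` attribute that has an `all()` method) can be demonstrated without Django. This is a simplified sketch: `FakeManager`, `Book`, `Author` and the template naming scheme are hypothetical, and this `object_list` returns a dictionary rather than rendering a response.

```python
class FakeManager:
    """Stand-in for a Django manager: only all() is provided."""

    def __init__(self, rows):
        self._rows = rows

    def all(self):
        return list(self._rows)


class Book:
    objects = FakeManager(["Dune", "Emma"])


class Author:
    objects = FakeManager(["Herbert", "Austen"])


def object_list(model):
    # Nothing here checks the type of `model`; it only uses
    # model.objects.all() and model.__name__.
    return {
        "template_name": f"{model.__name__.lower()}_list.html",
        "object_list": model.objects.all(),
    }


books = object_list(Book)
authors = object_list(Author)
```

Neither class inherits from a Django base class, yet the view function works with both because they satisfy the two attribute lookups it performs.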


pages: 1,331 words: 163,200

Hands-On Machine Learning With Scikit-Learn and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron

AlphaGo, Amazon Mechanical Turk, Anton Chekhov, backpropagation, combinatorial explosion, computer vision, constrained optimization, correlation coefficient, crowdsourcing, data science, deep learning, DeepMind, don't repeat yourself, duck typing, Elon Musk, en.wikipedia.org, friendly AI, Geoffrey Hinton, ImageNet competition, information retrieval, iterative process, John von Neumann, Kickstarter, machine translation, natural language processing, Netflix Prize, NP-complete, OpenAI, optical character recognition, P = NP, p-value, pattern recognition, pull request, recommendation engine, self-driving car, sentiment analysis, SpamAssassin, speech recognition, stochastic process

Custom Transformers

Although Scikit-Learn provides many useful transformers, you will need to write your own for tasks such as custom cleanup operations or combining specific attributes. You will want your transformer to work seamlessly with Scikit-Learn functionalities (such as pipelines), and since Scikit-Learn relies on duck typing (not inheritance), all you need is to create a class and implement three methods: fit() (returning self), transform(), and fit_transform(). You can get the last one for free by simply adding TransformerMixin as a base class. Also, if you add BaseEstimator as a base class (and avoid *args and **kwargs in your constructor) you will get two extra methods (get_params() and set_params()) that will be useful for automatic hyperparameter tuning.
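The three-method contract described above can be sketched in plain Python. To stay self-contained, this example deliberately does not import Scikit-Learn: `Scale`, `AddRowSum` and `run_steps` are hypothetical, and `run_steps` is only a toy stand-in for a pipeline, not Scikit-Learn's implementation.

```python
class Scale:
    """Multiplies every value by a fixed factor."""

    def __init__(self, factor=2.0):  # explicit keyword, no *args/**kwargs
        self.factor = factor

    def fit(self, X, y=None):
        return self  # nothing to learn; fit() returns self by convention

    def transform(self, X):
        return [[value * self.factor for value in row] for row in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


class AddRowSum:
    """Appends the sum of each row as an extra column."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return [row + [sum(row)] for row in X]

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


def run_steps(steps, X):
    # Toy pipeline: relies only on each step having fit_transform().
    for step in steps:
        X = step.fit_transform(X)
    return X


result = run_steps([Scale(factor=2.0), AddRowSum()], [[1.0, 2.0], [3.0, 4.0]])
```

In real Scikit-Learn code, the hand-written `fit_transform()` methods would be unnecessary once `TransformerMixin` is added as a base class, as the excerpt notes.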


pages: 999 words: 194,942

Clojure Programming by Chas Emerick, Brian Carper, Christophe Grand

Amazon Web Services, Benoit Mandelbrot, cloud computing, cognitive load, continuous integration, database schema, domain-specific language, don't repeat yourself, drop ship, duck typing, en.wikipedia.org, failed state, finite state, Firefox, functional programming, game design, general-purpose programming language, Guido van Rossum, higher-order functions, Larry Wall, mandelbrot fractal, no silver bullet, Paul Graham, platform as a service, premature optimization, random walk, Ruby on Rails, Schrödinger's Cat, semantic web, software as a service, sorting algorithm, SQL injection, Turing complete, type inference, web application

[288] Any object whose class provides a nullary close method will work here. This matches the close method defined by the java.io.Closeable interface, but because Clojure is a dynamic language, the resources you use in with-open do not need to actually implement that interface. This is called “duck typing” in many other dynamic languages.

[289] http://clojure.github.com/clojure/clojure.java.io-api.html.

Type Hinting for Performance

You may have noticed some code examples that use syntax referring to Java class names, such as ^String here:

    (defn length-of [^String text] (.length text))

The ^ClassName syntax defines a type hint, an explicit indication to the Clojure compiler of the object type of an expression, var value, or a named binding.
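The with-open footnote above has a close analogue in Python, shown here as a sketch in Python rather than Clojure. The standard library's `contextlib.closing` accepts any object that has a `close()` method; `Session` is a hypothetical class that implements no interface and no context-manager protocol.

```python
from contextlib import closing


class Session:
    """Hypothetical resource: no base class, just a close() method."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


session = Session()
with closing(session) as s:
    # Inside the block the resource is still open.
    still_open = not s.closed
# On exit, closing() has called session.close() for us.
```

As with with-open, the only requirement is a zero-argument `close()` method on the object; no declared interface is involved.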