dass

package module

v0.0.6 Latest Latest Go to latest Published: Feb 5, 2026 License: Apache-2.0 Imports: 14 Imported by: 2

Details

Valid go.mod file
Redistributable license
Tagged version
Stable version
Learn more about best practices

Repository

github.com/invertedv/dass

Links

Open Source Insights

README ¶

Dass - Data Assembly

Why

If you have a process that regularly pulls data, you may find the SQL is hard to create and maintain. There are lots of fields calculated. There are multiple joins. It's a mess. Want to add something new? Expect days of tweaking and testing.

The idea of Dass is to move data manipulation away from SQL and into Go where it can be more easily created and maintained.

Concept

Dass is a container for data. It is similar to a dataframe construct in that it stores the data in memory. Unlike most dataframes, the data is stored by row.

Each row is a map which simplifies the code. The value of the map is type any. This does add overhead, but is very flexible. Individual elements can be any type, including slices and maps.

Dass is lightweight in the sense that it supplies a small, core set of methods. These methods mostly support data construction and access.

Key, though, is that it supports applying user functions to the rows via the Apply method. You can move those calculations out of SQL and even joins through these functions.

Dass can read data from SQL, or CSV and XLSX files either on the web or locally.

Documentation ¶

Overview ¶

Dass - Data Assembly ¶

Why Dass ¶

If you have a process that regularly pulls data, you may find the SQL is hard to create and maintain. There are lots of fields calculated. There are multiple joins. It's a mess. Want to add something new? Expect days of tweaking and testing.

The idea of Dass is to move data manipulation away from SQL and into Go where it can be more easily created and maintained.

Concept ¶

Dass is a container for data. It is similar to a dataframe construct in that it stores the data in memory. Unlike most dataframes, the data is stored by row.

Each row is a map which simplifies the code. The value of the map is type *any*. This does add overhead, but is very flexible. Individual elements can be any type, including slices and maps.

Dass is lightweight in the sense that it supplies a small, core set of methods. These methods mostly support data construction and access.

Key, though, is that it supports applying user functions to the rows via the Apply method. You can move those calculations out of SQL and even joins through these functions.

Dass can read data from SQL, or CSV and XLSX files either on the web or locally.

Index ¶

func FetchCSV(source string) ([][]string, error)
func FetchXLSX(source string) ([][]string, error)
func Save(data, localFile string) error
func WebFetch(url string) (string, error)
type FnSpec
- func NewFnSpec(fnName string, rawFn any, args []string, assignTo string) (*FnSpec, error)
type Row
- func ParseRow(inRow, names, template, miss []string) (Row, string)
type RowFn
type Rows

Constants ¶

This section is empty.

Variables ¶

This section is empty.

Functions ¶

func FetchCSV ¶ added in v0.0.3

func FetchCSV(source string) ([][]string, error)

FetchCSV returns the contents of source CSV

- source -- either a web address or local file

func FetchXLSX ¶ added in v0.0.3

func FetchXLSX(source string) ([][]string, error)

FetchXLSX returns the contents of source XLSX

- source -- either a web address or local file

func Save ¶ added in v0.0.3

func Save(data, localFile string) error

Save saves the string returned by WebFetch. Works for both CSV and XLSX.

func WebFetch ¶ added in v0.0.3

func WebFetch(url string) (string, error)

WebFetch returns the contents of the page specified by url.

Types ¶

type FnSpec ¶

type FnSpec struct {
	// contains filtered or unexported fields
}

FnSpec manages converting an arbitrary user-supplied function to a RowFn.

func NewFnSpec ¶

func NewFnSpec(fnName string, rawFn any, args []string, assignTo string) (*FnSpec, error)

NewFnSpec creates a new FnSpec. Note that it cannot be applied to a Row until Make is run.

fnName - function name

rawFn - the raw function which is converted to a RowFn and then applied to a Row

args - the key values in Row that are the arguments to rawFn

assignTo - the key in row to assign the output to

func (*FnSpec) F ¶

func (fs *FnSpec) F() RowFn

func (*FnSpec) Make ¶

func (fs *FnSpec) Make(r Row) error

Make creates a RowFn from the user's raw function.

r - an example row that will serve as the argument to the created RowFn.

func (*FnSpec) Reset ¶

func (fs *FnSpec) Reset()

Reset resets the RowFn to nil.

func (*FnSpec) Return1 ¶

func (fs *FnSpec) Return1() reflect.Kind

Return1 is the primary return type of RowFn

func (*FnSpec) Return2 ¶

func (fs *FnSpec) Return2() reflect.Kind

Return2 is the secondary return type. If the return is a slice, Return1 is reflect.Slice and Return2 is the type of the slice elements. If the return is simple (e.g. Return1 = reflect.Int), Return2 has value reflect.Invalid.

func (*FnSpec) UpdateArgs ¶

func (fs *FnSpec) UpdateArgs(newArgs []string)

func (*FnSpec) UpdateAssignTo ¶

func (fs *FnSpec) UpdateAssignTo(newAT string)

type Row ¶

type Row map[string]any

Row is the base type: a single row of the data

func ParseRow ¶ added in v0.0.3

func ParseRow(inRow, names, template, miss []string) (Row, string)

ParseRow will parse a row of []string into a Row.

inRow - slice to parse template - expected type of each element: float, int, date, CCYYMMDD or string miss - when a missing value is encountered:

skip - skip row
fail - throw an error
return - return anyway with the value substituted

missing elements are set to math.Maxfloat64, math.MaxInt, 1/1/1900.

func (Row) Apply ¶

func (r Row) Apply(fs *FnSpec) error

Apply applies the function fs and adds the result to r.

func (Row) Equal ¶

func (r Row) Equal(r2 Row) bool

func (Row) Names ¶

func (r Row) Names() []string

func (Row) String ¶

func (r Row) String() string

type RowFn ¶

type RowFn func(r Row) any

RowFn is a function that can be applied to a row. The user supplies aribitrary function that is converted to a RowFn via FnSpec.Make. The function can return most data types including slices.

type Rows ¶

type Rows struct {
	// contains filtered or unexported fields
}

Rows collect a slice of row. Each row must have the same set of keys, though the underlying data can differ.

func NewRows ¶

func NewRows(r Row) *Rows

func ParseRows ¶ added in v0.0.3

func ParseRows(rows [][]string, names, template, miss []string, skip int) (*Rows, error)

ParseRows will parse rows of []string

rows - data to parse template - expected type of each element: float, int, date, CCYYMMDD or string miss - when a missing value is encountered:

	skip - skip row
	fail - throw an error
	return - return anyway with the value substituted

      missing elements are set to math.Maxfloat64, math.MaxInt, 1/1/1900.

skip - number of rows to skip before parsing

func ReadCSV ¶

func ReadCSV(r io.ReadCloser) (rows *Rows, e error)

func ReadSQL ¶

func ReadSQL(r *sql.Rows) (rows *Rows, e error)

ReadSQL creates a *Rows object from r.

Example ¶

This example shows out to load data from a query

table := os.Getenv("table")
qry := fmt.Sprintf("SELECT distinct state from %s ORDER BY state", table)

db := newConnectCH()
defer db.Close()

r, _ := db.Query(qry)

// load the data into rows
rows, _ := ReadSQL(r)

fmt.Println(rows)

Output:

state
"AK"
"AL"
"AR"
"AZ"
"CA"

Example (Slice) ¶

This example shows that you can return a slice from the query

table := os.Getenv("table")

db := newConnectCH()
defer db.Close()

qry := fmt.Sprintf("SELECT lnId, monthly.upb FROM %s WHERE length(monthly.upb) == 3  ORDER BY lnId limit 10", table)

r, _ := db.Query(qry)
rows, _ := ReadSQL(r)

fmt.Println(rows)

Output:

lnId,monthly.upb
"000097473420",[304379.06 303970.03 0]
"000097473926",[427500 426962.12 0]
"000097474438",[206160.44 205878.55 0]
"000097474496",[165000 164782.7 0]
"000097474584",[386250 385729.72 0]

func (*Rows) AddColumn ¶

func (rs *Rows) AddColumn(name string, data []any) error

func (*Rows) Append ¶

func (rs *Rows) Append(r Row) error

Append appends a row

func (*Rows) Apply ¶

func (rs *Rows) Apply(fs *FnSpec) error

Apply applies fs to all rows

Example (Map) ¶

This example shows how you can replace a SQL join with a RowFn. First, the data that would be the smaller table in the join is created. Then a function is built to apply that data to a Row. We then pull the larger table and use Apply to add the new element to each row.

const n = 100

table := os.Getenv("table")
zipQry := fmt.Sprintf("SELECT distinct zip3 from %s ORDER BY zip3", table)

db := newConnectCH()
defer db.Close()

// pull the unique zip3 (3-digit zip codes)
r, _ := db.Query(zipQry)
rows, _ := ReadSQL(r)
zipMap := make(map[string]int)

// build a simple map of zip3 to int.  We'll apply this to the data adding the "int" value to it.
for j, zip := range rows.Iter() {
	zipMap[zip["zip3"].(string)] = j
}

qry := fmt.Sprintf("SELECT lnId, fico, cltv, numBorr, purpose, propType, units, occ, state, msa, zip3 FROM %s ORDER BY lnId limit %v", table, n)
r, _ = db.Query(qry)

rows, _ = ReadSQL(r)

// rf applys the map to its input
rf := func(zip3 string) int {
	if v, ok := zipMap[zip3]; ok {
		return v
	}

	return -1
}

fs, _ := NewFnSpec("zipInt", rf, []string{"zip3"}, "zipInt")

_ = rows.Apply(fs)

fmt.Println(rows)

Output:

cltv,fico,lnId,msa,numBorr,occ,propType,purpose,state,units,zip3,zipInt
82,765,"000097473062","47900",2,"P","PU","P","MD",1,"207",194
95,766,"000097473063","42660",2,"P","SF","P","WA",1,"980",914
95,769,"000097473064","40140",1,"P","PU","P","CA",1,"923",858
95,687,"000097473065","35380",1,"P","SF","P","LA",1,"704",662
85,779,"000097473066","19820",1,"P","SF","R","MI",1,"480",459

Example (Simple) ¶

Using Apply to run a function over *Rows

table := os.Getenv("table")
qry := fmt.Sprintf("SELECT lnId, state, ltv, cltv FROM %s WHERE cltv < ltv ORDER BY lnId LIMIT 10", table)

db := newConnectCH()
defer db.Close()

r, _ := db.Query(qry)
rows, _ := ReadSQL(r)

fixCLTV := func(ltv, cltv float64) float64 {
	if cltv < ltv {
		return ltv
	}

	return cltv
}

fs, _ := NewFnSpec("cltvFixed", fixCLTV, []string{"ltv", "cltv"}, "cltvFixed")

_ = rows.Apply(fs)

fmt.Println(rows)

Output:

cltv,cltvFixed,lnId,ltv,state
-1,88,"000106024156",88,"NC"
-1,27,"000106024193",27,"CA"
-1,84,"000106024226",84,"CA"
-1,100,"000106024234",100,"ME"
-1,90,"000106024278",90,"GA"

Example (Slice) ¶

Using Apply to run a function that returns a slice

table := os.Getenv("table")
qry := fmt.Sprintf("SELECT lnId, state, ltv, cltv FROM %s ORDER BY lnId LIMIT 10", table)

db := newConnectCH()
defer db.Close()

r, _ := db.Query(qry)
rows, _ := ReadSQL(r)

fn := func(ltv, cltv float64) []float64 {
	if cltv < ltv {
		cltv = ltv
	}

	sltv := cltv - ltv
	return []float64{ltv, sltv, cltv}
}

fs, _ := NewFnSpec("ltvs", fn, []string{"ltv", "cltv"}, "ltvs")

_ = rows.Apply(fs)

fmt.Println(rows)

Output:

cltv,lnId,ltv,ltvs,state
82,"000097473062",82,[82 0 82],"MD"
95,"000097473063",95,[95 0 95],"WA"
95,"000097473064",95,[95 0 95],"CA"
95,"000097473065",95,[95 0 95],"LA"
85,"000097473066",85,[85 0 85],"MI"

func (*Rows) Column ¶

func (rs *Rows) Column(name string) ([]any, error)

Column returns one element from all the rows as a slice

name - name of the element to return

func (*Rows) DropColumn ¶

func (rs *Rows) DropColumn(name string) error

func (*Rows) Equal ¶

func (rs *Rows) Equal(rs2 *Rows) bool

Equal returns true if rs and rs2 have equal values throughout

func (*Rows) Iter ¶

func (rs *Rows) Iter() iter.Seq2[int, Row]

Iter ranges over the rows

func (*Rows) Names ¶

func (rs *Rows) Names() []string

Names returns the element names

func (*Rows) Row ¶

func (rs *Rows) Row(index int) Row

Row returns the row at row number index.

func (*Rows) RowCount ¶

func (rs *Rows) RowCount() int

func (*Rows) String ¶

func (rs *Rows) String() string

func (*Rows) Write ¶

func (rs *Rows) Write(w io.WriteCloser) error

Write writes all rows as comma-separated values, including a header, to w, closing when done.

Example ¶

This example shows how to create a CSV from a *Rows object.

table := os.Getenv("table")
qry := fmt.Sprintf("SELECT state, count(*) As n FROM %s GROUP BY state ORDER BY n DESC", table)

db := newConnectCH()
defer db.Close()

// run query
r, _ := db.Query(qry)

// load data
rows, _ := ReadSQL(r)

fmt.Println(rows)
outFile := os.TempDir() + "/states.csv"

f, _ := os.Create(outFile)
defer f.Close()

rows.Write(f)

Output:

state,n
"CA",10968569
"FL",4679574
"TX",4544139
"IL",3328800
"MI",2683424

Source Files ¶

View all Source files

?	: This menu
/	: Search site
f or F	: Jump to
y or Y	: Canonical URL