The tsbox package is built around a set of functions that convert time series of different classes to each other. They are frequency-agnostic, and allow the user to combine multiple non-standard and irregular frequencies. Because coercion works reliably, it is easy to write functions that work identically for all classes. So whether we want to smooth, scale, differentiate, chain-link, forecast, regularize or seasonally adjust a time series, we can use the same tsbox-command for any time series class.

This blog gives a short overview of the changes introduced in 0.1. A detailed overview of the package functionality is given in the documentation page (or in a previous blog-post).

Version 0.1, now on CRAN, brings a large number of bug fixes and improvements. A substantial change involves the treatment of `NA`

values in data frames. Previously, all `NA`

s in data frames were treated as implicit, and were only made explicit by a call to `ts_regular`

.

This has changed now. If you convert a `ts`

object to a data frame, all `NA`

values will be preserved. To replicate previous behavior, apply the `ts_na_omit`

function:

```
library(tsbox)
x.ts <- ts_c(mdeaths, austres)
x.ts
ts_df(x.ts)
ts_na_omit(ts_df(x.ts))
```

`ts_span`

extends outside of series spanThis lays the groundwork for `ts_span`

to be extensible. With `extend = TRUE`

, `ts_span`

extends a regular series with `NA`

values, up to the specified limits, similar to base `window`

. Like all functions in tsbox, this is frequency-agnostic. For example, in the following, the monthly series `mdeaths`

is extended by monthly `NA`

values, while the quarterly series `austres`

is extended by quarterly `NA`

values.

```
x.df <- ts_df(ts_c(mdeaths, austres))
ts_span(x.df, end = "1999-12-01", extend = TRUE)
```

`ts_default`

standardizes column names in a data frameIn rectangular data structures, i.e., in a `data.frame`

, a `data.table`

, or a `tibble`

, tsbox stores one or multiple time series in the ‘long’ format. By default, tsbox detects a *value*, a *time* and zero, one or several *id *columns. Alternatively, the time column and the value column can be explicitly named `time`

and `value`

. If explicit names are used, the column order will be ignored.

While automatic column name detection is useful in interactive mode, it produces unnecessary overhead in longer workflows. The helper function `ts_default`

detects and renames the time and the value column, so that auto-detection will be turned off in subsequent steps (note that the names of the id columns are not affected):

```
x.df <- ts_df(ts_c(mdeaths, austres))
names(x.df) <- c("a fancy id name", "date", "count")
ts_plot(x.df) # tsbox is fine with that
ts_default(x.df)
```

`ts_summary`

summarizes time series`ts_summary`

provides a frequency agnostic summary of a ts-boxable object:

```
ts_summary(ts_c(mdeaths, austres))
#> id obs diff freq start end
#> 1 mdeaths 72 1 month 12 1974-01-01 1979-12-01
#> 2 austres 89 3 month 4 1971-04-01 1993-04-01
```

`ts_summary`

returns a plain data frame that can be used for any purpose. It is also recommended for the extraction of various time series properties, such as `start`

, `freq`

or `id`

:

```
ts_summary(austres)$id
#> [1] "austres"
ts_summary(austres)$start
#> [1] "1971-04-01"
```

Finally, we fabricated a tsbox cheat sheet that summarizes most functionality. Print and enjoy working with time series.

]]>The R ecosystem knows a ridiculous number of time series classes. So, I decided to create a new universal standard that finally covers everyone’s use case… Ok, just kidding!

tsbox, now freshly on CRAN, provides a set of tools that are agnostic towards existing time series classes. It is built around a set of converters, which convert time series stored as ts, xts, data.frame, data.table, tibble, zoo, tsibble or timeSeries to each other.

To install the stable version from CRAN:

install.packages("tsbox")

To get an idea how easy it is to switch from one class to another, consider this:

library(tsbox) x.ts <- ts_c(mdeaths, fdeaths) x.xts <- ts_xts(x.ts) x.df <- ts_df(x.xts) x.tbl <- ts_tbl(x.df) x.dt <- ts_tbl(x.tbl) x.zoo <- ts_zoo(x.dt) x.tsibble <- ts_tsibble(x.zoo) x.timeSeries <- ts_timeSeries(x.tsibble)

We jump form good old `ts`

objects to`xts`

, store our time series in various data frames and convert them to some highly specialized time series formats.

Because these converters work nicely, we can use them to make functions class-agnostic. If a class-agnostic function works for one class, it works for all:

ts_scale(x.ts) ts_scale(x.xts) ts_scale(x.df) ts_scale(x.dt) ts_scale(x.tbl)

`ts_scale`

normalizes one or multiple series, by subtracting the mean and dividing by the standard deviation. It works like a ‘generic’ function: You can apply it on any time series object, and it will return an object of the same class as its input.

So, whether we want to smooth, scale, differentiate, chain-link, forecast, regularize or seasonally adjust a series, we can use the same commands to whatever time series at hand. tsbox offers a comprehensive toolkit for the basics of time series manipulation. Here are some additional operations:

ts_pc(x.ts) # percentage change rates ts_forecast(x.xts) # forecast, by exponential smoothing ts_seas(x.df) # seasonal adjustment, by X-13 ts_frequency(x.dt, "year") # convert to annual frequency ts_span(x.tbl, "-1 year") # limit time span to final year

There are many more. Because they all start with `ts_`

, you can use auto-complete to see what’s around. Most conveniently, there is a time series plot function that works for all classes and frequencies:

ts_plot( `Airline Passengers` = AirPassengers, `Lynx trappings` = ts_df(lynx), `Deaths from Lung Diseases` = ts_xts(fdeaths), title = "Airlines, trappings, and deaths", subtitle = "Monthly passengers, annual trappings, monthly deaths" )

There is also a version that uses ggplot2 and has the same syntax.

You may have wondered why we treated data frames as a time series class. The spread of dplyr and data.table has given data frames a boost and made them one of the most popular data structures in R. So, storing time series in a data frame is an obvious consequence. And even if you don’t intend to keep time series in data frames, this is still the format in which you import and export your data. tsbox makes it easy to switch from data frames to time series and back.

tsbox includes tools to make existing functions class-agnostic. To do so, the `ts_`

function can be used to wrap any function that works with time series. For a function that works on `"ts"`

objects, this is as simple as that:

ts_rowsums <- ts_(rowSums) ts_rowsums(ts_c(mdeaths, fdeaths))

Note that `ts_`

returns a function, which can be used with or without a name.

In case you are wondering, tsbox uses data.table as a backend, and makes use of its incredibly efficient reshaping facilities, its joins and rolling joins. And thanks to anytime, tsbox will be able to recongnize almost any date format without manual intervention.

So, enjoy some relieve in R’s time series class struggle.

Website: www.tsbox.help

Similarly to the United States, public data in Switzerland is produced by a large number of different offices, which makes it hard to find any particular series. dataseries.org provides a structured and automatically updated collection of most of these series. We are still working on the data input, but are pretty much complete in the field of Economics.

You can download the data as spreadsheets or graphs, or embed interactive widgets in your website. Alternatively, you can import the data directly into R, using the dataseries package. Install the package from CRAN,

`install.packages("dataseries")`

and run the `ds`

function with the `id`

argument that you find on the website:

```
plot(dataseries::ds("GDP.PBRTT.A.R", "ts"),
ylab = "mio CHF, at 2010 prices, s. adj.",
main = "Gross Domestic Product")
```

This will give you an R plot of Switzerland’s GDP. (The data is cached, so calling the function again will not re-download until you restart the R session.)

In the following, I will use data from dataseries.org to produce a live forecast of Switzerland’s GDP. Each day the model is run, it will be ensured that the latest data is used. That way it is possible to produce a transparent and up-to-date forecast. For the following exercise, I will only use tools from R base, but it is of course possible to use the same data in a more advanced modeling framework.

In order to produce a reasonable forecast, we want to track early information on the business cycle, which is mostly survey data. We will use a question from the SECO Consumer Confidence Survey on current economic performance, the Credit Suisse / Procure Purchasing Managers’ Index and the ETHZ KOF Barometer.

Getting these indicators from dataseries.org directly into R is easy. Because these data are measured at different frequencies, we need to convert them to the same quarterly frequency as GDP. There are many packages that offer functions for that (e.g., the tempdisagg package has functions to move both to higher or lower frequencies), but I will stick to basic R here:

# Aggregating months to quarters (post updated on May 6, 2017) to_quarterly <- function(x){ aggregate(x, nfrequency = 4, FUN = mean) } pmi <- to_quarterly(dataseries::ds("PMI.SA.PM", "ts")) kof <- to_quarterly(dataseries::ds("KOF.KFBR", "ts")) csent <- dataseries::ds("CCI.GEPC", "ts")

A plot of these series shows the common trend in these variables and gives you an indication of the business cycle, which may have turned upward in recent months.

`plot(cbind(pmi, kof, csent), main = "Business Cycle Indicators")`

Since these series are stationary, our left hand side variable should be stationary as well. This is accomplished by calculating percentage change rates of GDP:

gdp.level <- dataseries::ds("GDP.PBRTT.A.R", "ts") gdp <- (gdp.level / lag(gdp.level, -1)) - 1

R’s workhorse for time series modeling is the `arima`

function, which allows you to construct a univariate or multivariate model of GDP growth. Since the data is seasonally adjusted, a simple autoregressive process (AR1) offers a good benchmark:

```
# AR1
m0 <- arima(gdp, order = c(1, 0, 0))
fct0 <- predict(m0, n.ahead = 1)$pred
# GDP Growth Q1: +0.3
```

If you need advice on which ARIMA model to choose, the information criterions, accessed by the R functions `AIC`

or `BIC`

, can help you to choose a model. Simply take the model with lowest information criterion. The `auto.arima`

function from the forecast package also allows you to do the selection automatically.

We can include our series individually or jointly and estimate a range of different models. A good model (in terms of the AIC information criterion) is the following, which uses PMI and KOF data (but not consumer sentiment data):

# PMI, KOF dta <- window(cbind(pmi, kof), start = start(pmi), end = end(pmi)) m1 <- arima(window(gdp, start = start(dta)), xreg = window(dta, end = end(gdp))) fct1 <- predict(m1, n.ahead = 1, newxreg = window(dta, start = tsp(gdp)[2] + 0.25, end = tsp(gdp)[2] + 0.25) )$pred # GDP Growth Q1: +0.7

The model’s forecast for the first quarter of 2017 is 0.7 – a value that hasn’t been reached for more than two years.

If you have multiple indicators at hand, a common problem is multicollinearity, the fact that indicators are correlated, and therefore too many indicators deteriorate the quality of the model estimation.

An easy fix is to use a factor model, where the indicators are summarized in a few factors, which can be calculated by principal components (see Stock and Watson 2002):

# PMI, KOF, Consumer Sentiment, first Principal Component pca <- prcomp(window(cbind(pmi, kof, csent), start = start(pmi), end = tsp(gdp)[2] + 0.25), scale. = TRUE) dta.pca <- ts(pca$x[, 'PC1'], start = start(pmi), frequency = 4) m2 <- arima(window(gdp, start = start(dta)), xreg = window(dta.pca, end = end(gdp))) fct2 <- predict(m2, n.ahead = 1, newxreg = window(dta.pca, start = tsp(gdp)[2] + 0.25) )$pred # GDP Growth Q1: +0.7

Again, we get a forecast value of 0.7. Overall, survey data indicates that the economy is well on track. Let’s do a graphical comparison of our forecasts:

# skeletons to include forecasts gdp.fct0 <- window(gdp, extend = TRUE, end = tsp(gdp)[2] + 0.25) gdp.fct1 <- gdp.fct2 <- gdp.fct0 # plug forecasts into skeletons window(gdp.fct0, start = end(gdp.fct0)) <- fct0 window(gdp.fct1, start = end(gdp.fct1)) <- fct1 window(gdp.fct2, start = end(gdp.fct2)) <- fct2 ts.plot(window(cbind(gdp, gdp.fct0, gdp.fct1, gdp.fct2), start = 2010), col = 1:4, ylab = "quarterly growth rates, s. adj.", main = "GDP Forecasts") legend("topright", legend = c("GDP Growth Rate", "AR 1 Forecast", "PMI, KOF", "Principal Component"), lty = 1, col = 1:4, bty = "n")

Publication of first quarter GDP is on June 1, 2017. See you in a month!

```
install.packages("seasonal")
```

The latest version 1.3 comes with a new `udg`

function and a customizable `summary`

method, which give power users of X-13 a convenient way to check the statistics that are of *their* interest. For a full list of changes, see the package NEWS.

Version 1.3 offers a generalized way to access diagnostic statistics. In seasonal, it was always possible to use *all options* of X-13 and access *all output series*. Now it is easy to access *all diagnostics* as well.

The main new function is `udg`

, named after the X-13 output file which it is reading. Consider a simple call to `seas`

(the main function of the seasonal package) that uses the X-11 seasonal adjustment method:

```
m <- seas(AirPassengers, x11 = "")
udg(m)
```

The `udg`

function returns a list containing 357 named diagnostics. They are properly type-converted, so they can be directly used for further analysis within R.

If we ask for a specific statistic, such as the popular X-11 *M statistics*, the result will be simplified to a numeric vector (see `?udg`

for additional options):

```
udg(m, c("f3.m01", "f3.m02", "f3.m03", "f3.m04"))
## f3.m01 f3.m02 f3.m03 f3.m04
## 0.041 0.042 0.000 0.283
```

There are also some new wrappers for commonly used statistics, such as `AIC`

, `BIC`

, `logLik`

or `qs`

, which use the `udg`

function.

The new functionality paves the way for a customizable `summary`

method for `seas`

objects. For example, if we want to add the *M statistics* for X-11 adjustments to the summary, we can write:

```
summary(m, stats = c("f3.m01", "f3.m02", "f3.m03", "f3.m04"))
## Call:
## seas(x = AirPassengers, x11 = "")
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## Weekday -0.0029497 0.0005232 -5.638 1.72e-08 ***
## Easter[1] 0.0177674 0.0071580 2.482 0.0131 *
## AO1951.May 0.1001558 0.0204387 4.900 9.57e-07 ***
## MA-Nonseasonal-01 0.1156204 0.0858588 1.347 0.1781
## MA-Seasonal-12 0.4973600 0.0774677 6.420 1.36e-10 ***
## ---
## Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
##
## X11 adj. ARIMA: (0 1 1)(0 1 1) Obs.: 144 Transform: log
## AICc: 947.3, BIC: 963.9 QS (no seasonality in final): 0
## Box-Ljung (no autocorr.): 26.65 Shapiro (normality): 0.9908
## f3.m01: 0.041 f3.m02: 0.042 f3.m03: 0 f3.m04: 0.283
```

Note the new line at the end, which shows the *M statistics*.

If we want to routinely consider these statistics in our `summary`

, we can set the `seas.stats`

option accordingly:

```
options(seas.stats = c("f3.m01", "f3.m02", "f3.m03", "f3.m04"))
```

This will change the default behavior, and

```
summary(m)
```

will return the same output as above. To restore the default behavior, set the option back to `NULL`

.

```
options(seas.stats = NULL)
```

The seasonal package by Christoph Sax brings a very featureful and expressive interface for working with seasonal data to the R environment. It uses the standard tool of the trade: X-13ARIMA-SEATS. This powerful program is provided by the statisticians of the US Census Bureau based on their earlier work (named X-11 and X-12-ARIMA) as well as the TRAMO/SEATS program by the Bank of Spain. X-13ARIMA-SEATS is probably the best known tool for de-seasonalization of timeseries, and used by statistical offices around the world.

Sadly, it also has a steep learning curve. One interacts with a basic command-line tool which users have to download, install and properly reference (by environment variables or related means). Each model specification has to be prepared in a special ‘spec’ file that uses its own, cumbersome syntax.

As seasonal provides all the required functionality to use X-13ARIMA-SEATS from R — see the very nice seasonal demo site — it still required the user to manually deal with the X-13ARIMA-SEATS installation.

So we decided to do something about this. A pair of GitHub repositories provide both the underlying binary in a per-operating system form (see x13prebuilt) as well as a ready-to- use R package (see x13binary) which uses the former to provide binaries for R. And the latter is now on CRAN as package x13binary ready to be used on Windows, OS-X or Linux. And the seasonal package (in version 1.2.0 – now on CRAN – or later) automatically makes use of it. Installing seasaonal *and* x13binary in R is now as easy as:

`install.packages("seasonal")`

which opens the door for effortless deployment of powerful deasonalization. By default, the principal function of the package employs a number of automated techniques that work well in most circumstances. For example, the following code produces a seasonal adjustment of the latest data of US retail sales (by the Census Bureau) downloaded from Quandl:

```
library(seasonal)
url <- "https://www.quandl.com/api/v3/datasets/USCENSUS/BI_MARTS_44000_SM.csv?order=asc"
rs <- ts(read.csv(url)$Value/1e3, start = c(1992, 1), frequency = 12)
m1 <- seas(rs)
plot(m1, main = "Retail Trade: U.S. Total Sales",
ylab = "USD (in Billions)")
```

This tests for log-transformation, performs an automated ARIMA model search, applies outlier detection, tests and adjusts for trading day and Easter effects, and invokes the SEATS method to perform seasonal adjustment. And this is how the adjusted series looks like:

Of course, you can access all available options of X-13ARIMA-SEATS as well. Here is an example where we adjust the latest data for Chinese exports (as tallied by the US FED), taking into account the different effects of Chinese New Year before, during and after the holiday:

```
url <- "https://www.quandl.com/api/v3/datasets/FRED/VALEXPCNM052N.csv?order=asc"
xp <- ts(read.csv(url)$VALUE/1e9, start = c(1981, 1), frequency = 12)
m2 <- seas(window(xp, start = 2000),
xreg = cbind(
genhol(cny, start = -7, end = -1, center = "calendar"),
genhol(cny, start = 0, end = 7, center = "calendar"),
genhol(cny, start = 8, end = 21, center = "calendar")),
regression.aictest = c("td", "user"),
regression.usertype = "holiday")
plot(m2, main = "Goods, Value of Exports for China",
ylab = "USD (in Billions)")
```

which generates the following chart demonstrating a recent flattening in export activity measured in USD.

We hope this simple examples illustrates both how powerful a tool `X-13ARIMA-SEATS`

is, but also just how easy it is to use X-13ARIMA-SEATS from R now that we provide the x13binary package automating its installation.

]]>

The latest version of www.seasonal.website comes with several new features:

The main new feature is a live parser of X-13 spc files. Changes in the *Options*, triggered by the pull-down menus, or changes in the *R Call*, are reflected in an updated *X-13 Call*. On the other hand, changes in the *X-13 Call* will be reflected in updates in the *Options* and the *R Call*.

This brings interesting new possibilities:

- Non-R-users may use the website to generate spc files, which they can use in any software that includes X-13ARIMA-SEATS.
- People familiar with X-13 may use the spc syntax to learn about the syntax of the R-package
*seasonal*. - People familiar with the R-package
*seasonal*may use it learn about the spc syntax.

The upload/download feature has been reworked. A button on the top-right corner opens a new upload and download dialog.

Both XLSX and CSV formats are supported. You can upload and adjust your own monthly or quarterly time series. All data will be permanently deleted after your session.

The summary, previously just the printed output of the R-function *summary*, has been overhauled. Colored flags indicate the significance level of the coefficients, reddish colors indicate warning signs from the tests.

The R package *seasonal* simplifies the task by providing an interface to X-13, the newest seasonal adjustment software by the US Census Bureau. It combines and extends the capabilities of the older X-12ARIMA and TRAMO-SEATS software packages. The most simple use of *seasonal *requires the application of the main function to a time series, which invokes automated procedures that work well in many circumstances:

seas(AirPassengers)

A new shiny based website is showcasing the use of *seasonal* and allows for online adjustment of time series, without the need to download and install *seasonal*. The *AirPassengers* series is set as the default series, but can be replaced by any uploaded series. There are other demo series that show the use of the software to adjust Indian Diwali or Chinese New Year effects.

The site allows to adjust most parameters of X-13, and to view and download a substantial part of its output. Frequently used options can be chosen from a drag and drop menu, while less often used options can be chosen by manipulating the R-Call itself.

Here are some of the most interesting features of the website:

Frequently used options of X-13 can be modified using the drop down selectors. Each change will result in a re-estimation of the seasonal adjustment model. The R-call, the output and the summary are updated accordingly.

A substantial part of the output of X-13ARIMA-SEATS can be shown on the website. Click and drag to zoom into the graph. Double click to restore the original view.

The R-Call to seasonal can be modified and run online. In the picture below, the ARIMA model has been adjusted to include an autoregressive parameter of order 2. Press the button to execute the modified call.

User defined series can be uploaded, importing from Excel or CSV. Also, all viewable series can be downloaded as Excel or CSV.

Chinese New Year or Indian Diwali support is included out of the box and can be selected from the drop down menu. Adjustment for these holidays is as easy as adjusting Easter effects.

The examples from the official manual of X-13 can be run directly on the website. The collection of examples can found here.

]]>Because of its importance, Chinese New Year seriously distorts monthly time series, which are usually reported according to the Gregorian calendar. Unlike Easter, Chinese New Year does not affect quarterly time series, as it always falls in the first quarter.

The standard software packages for seasonal adjustment, X-12-ARIMA and X-13-ARIMA-SEATS (developed by the U.S. Census Bureau) or Tramo Seats (developed by the Bank of Spain) have a built-in adjustment procedure for Easter holiday, but not for Chinese New Year. However, all packages allow for the inclusion of user defined variables, and the Chinese New Year can be modeled as such.

With the R package *seasonal*, generating and including such a series is easy. We will use it in the following to seasonally adjust and remove Chinese New Year effects from the nominal dollar value of imports to China. *seasonal* is an interface to X-13ARIMA-SEATS; for more information and installation details, see here.

Chinese imports are included as an example series in *seasonal*. As the series has a very different seasonal pattern before 2000, we focus on the later period. (Adjusting the whole series in one step is possible, but for good results one should manually model the seasonal break.)

```
library(seasonal)
data(cntrade) # contains imports ('imp') and exports ('exp') of China
imp <- window(imp, start = 2000) # this shortens the series
```

*seasonal* includes the `genhol()`

function, a R version of the equally named software utility by the U.S. Census Bureau. Using the dates of the Chinese New Year as an input, it produces a time series with the deviations from the monthly means. Here we are assuming that the holiday starts on New Year’s Eve and lasts for one week.

```
data(holiday) # dates of Chinese New Year and Easter, included in seasonal
cny.ts <- genhol(cny, start = -1, end = 6, center = "calendar")
```

In 2014, only two days in January were affected by the holiday (New Year’s Eve and New Year’s Day). 75% of the holiday fell into February. Thus, January was affected slightly less than average, February slightly more. This is very different from 2012, when the holiday completely fell into January.

```
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
2011 -0.26 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2012 0.74 -0.74 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2013 -0.26 0.26 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
2014 -0.01 0.01 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00
```

The time series `cny.ts`

can be included in the main seasonal adjustment. The automated procedures of X-13ARIMA-SEATS can be applied to the `imp`

series in the following way:

`m1 <- seas(imp, xreg = cny.ts, regression.usertype = "holiday", x11 = list()) summary(m1)`

`## ## Call: ## seas(x = imp, xreg = cny.ts, regression.usertype = "holiday", ## x11 = list()) ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## cny.ts -0.18104 0.01548 -11.70 < 2e-16 *** ## Weekday 0.00514 0.00104 4.94 7.8e-07 *** ## LS2008.Nov -0.37584 0.04745 -7.92 2.3e-15 *** ## MA-Nonseasonal-01 0.39776 0.07202 5.52 3.3e-08 *** ## MA-Seasonal-12 0.72749 0.06428 11.32 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## ARIMA structure: (0 1 1)(0 1 1) Number of obs.: 169 Transform: log ## AICc: 1.6e+03, BIC: 1.62e+03 QS seas. test (adj. series): 0 ## Box-Ljung (no autocorr.): 33.6 . Shapiro (normality): 0.978 **`

With `xreg`

, arbitrary user defined regressors can be included, `regression.usertype = "holiday"`

ensures that the final series does not include the regression effect. We also have chosen X11 as the decomposition method.

Unsurprisingly, the summary reveals a highly significant Chinese New Year effect. As the automatic model has been estimated on the logarithmic series, the coefficient of -0.18 indicates that New Year in 2012 has lowered imports by approximately 0.74 * 18 = 13%. The automatic procedure has also detected weekday effects and a level shift during the financial crisis.

We can do even better by using more than one user defined regressors, one for the pre-New-Year period and one for the post-New-Year period (thanks, Freya Beamish):

`pre_cny <- genhol(cny, start = -6, end = -1, frequency = 12, center = "calendar") post_cny <- genhol(cny, start = 0, end = 6, frequency = 12, center = "calendar") m2 <- seas(x = imp, xreg = cbind(pre_cny, post_cny), regression.usertype = "holiday", x11 = list()) summary(m2)`

`## ## Call: ## seas(x = imp, xreg = cbind(pre_cny, post_cny), regression.usertype = "holiday", ## x11 = list()) ## ## Coefficients: ## Estimate Std. Error z value Pr(>|z|) ## pre_cny 0.070843 0.019199 3.69 0.00022 *** ## post_cny -0.241043 0.020816 -11.58 < 2e-16 *** ## Weekday 0.005233 0.000943 5.55 2.9e-08 *** ## LS2008.Nov -0.357887 0.045790 -7.82 5.5e-15 *** ## MA-Nonseasonal-01 0.331626 0.073967 4.48 7.3e-06 *** ## MA-Seasonal-12 0.687479 0.065740 10.46 < 2e-16 *** ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 ## ## ARIMA structure: (0 1 1)(0 1 1) Number of obs.: 169 Transform: log ## AICc: 1.59e+03, BIC: 1.61e+03 QS seas. test (adj. series):0.75 ## Box-Ljung (no autocorr.): 37.6 * Shapiro (normality): 0.984 *`

`plot(m2)`

There are actually two kind of New Year effects: Until New Year’s Eve, import activity is higher than usual. During the holiday, it is lower. By including two regressors, these opposite effects can be modeled. Note that the negative effect is more pronounced than the positive one.

The model could be further refined. With the `static()`

function, a non-automatic version of the previous call can be extracted. It can be copy-pasted and used for further manipulations.

`static(m2)`

`## seas(x = imp, xreg = cbind(pre_cny, post_cny), regression.usertype = "holiday", ## x11 = list(), regression.variables = c("td1coef", "ls2008.Nov" ## ), arima.model = "(0 1 1)(0 1 1)", regression.aictest = NULL, ## outlier = NULL, transform.function = "log")`

The `inspect()`

function opens an interactive window that allows for the manipulation of a number of arguments. With each change, the adjustment process and the visualizations are recalculated. (This only works with R Studio.)

```
inspect(m)
```

After some playing around, we would probably stay with the two regressor adjustment model from above:

```
m2 <- seas(x = imp, xreg = cbind(pre_cny, post_cny), regression.usertype = "holiday",
x11 = list(), regression.variables = c("td1coef", "ls2008.Nov"),
arima.model = "(0 1 1)(0 1 1)", regression.aictest = NULL,
outlier = NULL, transform.function = "log")
```

It’s far form perfect. Normality statistics are bad, and there may be some traces of autocorrelation. On the other hand, the seasonal component is stable and revisions are small.

Was it worth the pain? The following graph shows the same seasonal adjustment with and without the Chinese New Year adjustment:

```
m3 <- seas(x = imp, x11 = list(), regression.variables = c("td1coef", "ls2008.Nov"),
arima.model = "(0 1 1)(0 1 1)", regression.aictest = NULL, outlier = NULL,
transform.function = "log")
ts.plot(diff(log(cbind(final(m2), final(m3)))), col = c("red", "blue"),
lwd = c(2, 1))
```

In 2012, we would have concluded that imports have plumped in January, soared in February and plumped again in March (blue line). With the adjustment, we rightly conclude that there was no such craziness (red line).

```
ts.plot(final(m2), final(m1), col = c("red", "blue"), lwd = c(2, 1))
```

How useful is the two regressor model? Most of the time, the single regressor model performs reasonably well and leads to results similar to the two regressors model. This year, however, the Lunar New Year fell on January 31. As people were importing more in the pre-New-Year period, January imports were actually affected by a *positive* New Year effect. The right adjustment would be to correct the numbers downward! With the one regressor model, we would wrongly conclude that imports have soared (blue line). In fact, they have actually stagnated (red line).

First, get the data (in PWT 8, you need to calculate per capita measures yourself):

library(pwt8) pwt8.0$cap <- pwt8.0$rgdpe / pwt8.0$pop

Use ggplot2 to draw time series for a variable. Exchange the isocodes in the list or the variable name to alter the graph.

library(ggplot2) dta <- subset(pwt8.0, isocode %in% c("USA", "JPN", "KOR", "CHN")) qplot(dta$year, dta[, 'cap'], geom = "line", group = dta$isocode, color = as.factor(dta$isocode) ) + theme(legend.title = element_blank()) + scale_y_log10()

and that’s what you will get:

Use ggplot2 and manipulate to be even more flexible. (You need RStudio for that, but you should have it anyway!)

library(manipulate) dta <- subset(pwt8.0, isocode %in% c("USA", "JPN", "KOR", "CHN")) manipulate( qplot(dta$year, dta[, vars], geom = "line", group = dta$isocode, color = as.factor(dta$isocode) ) + theme(legend.title = element_blank()), vars = picker(as.list(colnames(pwt8.0)[-(1:3)])) )

you will get a nice menu to choose the variable of your interest.

googleVis let’s you use the Google Visualisation API.

library(googleVis)

First, get some pretty colors from colorbrewer2.org:

colorbrewer <- "{maxValue: 40000, colors:['#F7FCF0', '#E0F3DB', '#CCEBC5', '#A8DDB5', '#7BCCC4', '#4EB3D3', '#2B8CBE', '#0868AC', '#084081']}"

And now draw a map with your data (make a screenshot if you want to use it)

plot(gvisGeoChart(subset(pwt8.0, year == 2011), "country", "cap", options=list(colorAxis = colorbrewer)))

That’s the result:

Or simply show all data, in a very similar way as the Google Public Data Explorer does.

plot(gvisMotionChart(pwt8.0, idvar="country", timevar="year")))]]>