# Escaping the simplex, part 1

[This article was first published on **logopt: a journey in R, finance and open source**, and kindly contributed to R-bloggers].

Before tackling the main subject, two quick notes:

- I did not post for quite a while, in part because I followed the Coursera online course Introduction to Computational Finance and Financial Econometrics. It was a nice refresher, extremely well presented, and included some R. It did consume enough time to make posting difficult, though.
- I started using RStudio, and I am quite happy with it under Windows. This has also affected my code, as RStudio automatically stacks your plots.

Now for the main subject. The main reason to escape the simplex is that the Best Constant Rebalanced Portfolio (BCRP) may lie outside it. This is a well-known feature of the more traditional mean-variance analysis, where the constrained portfolio is sparser than the unconstrained one. The BCRP shows a similar effect: the constrained optimization results in a sparse portfolio, i.e. many weights are (close to) zero.
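To make the constraint explicit, here is the optimization problem written out. The notation is an assumption of this sketch, not taken from the logopt documentation: $x_t$ is the vector of price relatives of the $m$ assets in period $t$, $b$ is the vector of portfolio weights, and $T$ is the number of periods.

$$
b^{*} = \arg\max_{b \in \Delta_m} \sum_{t=1}^{T} \log\left(b^{\top} x_t\right),
\qquad
\Delta_m = \left\{ b \in \mathbb{R}^m : b_i \ge 0,\ \sum_{i=1}^{m} b_i = 1 \right\}
$$

Maximizing the sum of logs is equivalent to maximizing the terminal wealth $\prod_{t=1}^{T} b^{\top} x_t$. A weight of exactly zero means that the constraint $b_i \ge 0$ is active, which is what lying on the boundary of the simplex amounts to.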

We can illustrate this by calculating the weights of the constrained BCRP. The BCRP function in logopt currently supports optimization on the simplex only, and we can reuse the code originally presented in Universal portfolio, part 10 to accumulate the BCRP weights across many combinations of assets from the reference data. The result shows that the proportion of (close to) zero weights is high.
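As a rough illustration of how the weights are accumulated, the sketch below runs `combn` with a function argument on a tiny made-up matrix. Everything in it is hypothetical: the data is random, and `SortedToyWeights` is only a stand-in that returns sorted, normalized weights; it does not call `bcrp.optim`.

```r
# Toy stand-in for the price-relative matrix: 6 periods, 4 assets (made-up data).
set.seed(1)
relatives <- matrix(runif(24, min = 0.9, max = 1.1), nrow = 6, ncol = 4)

# Hypothetical stand-in for the optimizer: normalized mean relatives,
# sorted increasingly so that row k of the result holds the k-th smallest weight.
SortedToyWeights <- function(cols, relatives) {
  m <- colMeans(relatives[, cols])
  sort(m / sum(m))
}

# combn calls the function once per combination of column indices and,
# with simplify = TRUE, binds the results into a matrix, one column per combination.
w <- combn(ncol(relatives), 2, FUN = SortedToyWeights, simplify = TRUE,
           relatives = relatives)
print(dim(w))
```

Each row of such a matrix collects one rank of the sorted weights over all combinations, which is the shape the plotting code relies on.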

The full script below, run as is, takes about 7 hours because of the large number of combinations. You can reduce the size of the problem by editing TupleSizes.
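To get a feel for where the time goes, the number of optimizations per tuple size can be counted with `choose`. This is a side calculation, not part of the original script, and the value of 36 stocks is an assumption here (the script itself computes `nStocks` from the data).

```r
# Assumed number of stocks in the data set; the script derives it as dim(x)[2].
nStocks <- 36

small <- c(2, 3, 4, 5)
large <- c(nStocks - 3, nStocks - 2, nStocks - 1, nStocks)

# One optimization is run per combination of assets.
nSmall <- choose(nStocks, small)
nLarge <- choose(nStocks, large)
names(nSmall) <- small
names(nLarge) <- large

print(nSmall)
print(nLarge)
cat(sprintf("Total number of optimizations: %.0f\n", sum(nSmall) + sum(nLarge)))
```

Under that assumption the 5-asset case dominates the count, so dropping 5 from TupleSizes is the most effective way to shorten the run.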

Note that Syntax Highlighter has had problems with R-bloggers in the past, so you might need to go to the original page to view the code.

```r
# assume the use of RStudio (no explicit management of graphic devices)
library(logopt)
data(nyse.cover.1962.1984)
x <- coredata(nyse.cover.1962.1984)
nStocks <- dim(x)[2]

# Apply fFinalWealth to every combination of assets for each tuple size and
# store the results in a list named ListName (skipped if it already exists).
EvaluateOnAllTuples <- function(ListName, TupleSizes, fFinalWealth, ...) {
  if (!exists(ListName)) {
    LocalList <- list()
    for (i in seq_along(TupleSizes)) {
      TupleSize <- TupleSizes[i]
      ws <- combn(x=(1:nStocks), m=(TupleSize), FUN=fFinalWealth, simplify=TRUE, ...)
      LocalList[[i]] <- ws
    }
    assign(ListName, LocalList, pos=parent.frame())
  }
}

# evaluate the sorted coefficients for best CRP
SortedOptB <- function(cols, ...) {
  x <- list(...)[[1]]
  x <- x[,cols]
  b <- bcrp.optim(x)
  return(sort(b))
}

TupleSizes <- c(2,3,4,5)
EvaluateOnAllTuples("lSortedOptBSmall", TupleSizes, SortedOptB, x)
TupleSizes <- c(nStocks-3,nStocks-2,nStocks-1,nStocks)
EvaluateOnAllTuples("lSortedOptBLarge", TupleSizes, SortedOptB, x)

Colors <- c("red", "green", "blue", "brown", "cyan", "darkred","darkgreen")

# for the small ones, plot the ECDF of every sorted weight
for (iL in seq_along(lSortedOptBSmall)) {
  nCoeff <- nrow(lSortedOptBSmall[[iL]])
  E <- ecdf(lSortedOptBSmall[[iL]][1,])
  Title <- sprintf("Empirical CDF for all sorted weights for BCRP of %d assets", nCoeff)
  plot(E, col=Colors[1], xlim=c(0,1), pch=".", main = Title, xlab = "relative weight")
  cat(sprintf("Combinations of %d assets\n", nCoeff))
  for (iB in 1:nCoeff) {
    E <- ecdf(lSortedOptBSmall[[iL]][iB,])
    lines(E, col=Colors[iB], pch=".")
    if (iB < nCoeff) {
      cat(sprintf("Percent of portfolio with %d weight(s) smaller than 0.001: %f\n", iB, E(0.001) ))
    }
  }
}

SmallOf2 <- lSortedOptBSmall[[1]][1,]
hist(SmallOf2,n=25,probability=TRUE, main="Histogram of smallest weight for two assets", xlab="Smallest coefficient")
lines(density(SmallOf2, bw=0.02),col="blue")

# for the large ones, we show only the largest coefficients and show them in
# opposite order to find how many coefficients are not always insignificant
for (iL in seq_along(lSortedOptBLarge)) {
  nCoeff <- nrow(lSortedOptBLarge[[iL]])
  E <- ecdf(lSortedOptBLarge[[iL]][nCoeff,])
  Title <- sprintf("Empirical CDF for largest sorted weights for BCRP of %d assets", nCoeff)
  plot(E, col=Colors[1], xlim=c(0,1), pch=".", main = Title, xlab = "relative weight")
  cat(sprintf("Combinations of %d assets\n", nCoeff))
  for (iB in 1:nCoeff) {
    iCoeff <- nCoeff-iB+1
    E <- ecdf(lSortedOptBLarge[[iL]][iCoeff,])
    if (E(0.001) < 0.9999) {
      lines(E, col=Colors[iB], pch=".")
      cat(sprintf("Percent of portfolio with %d weight(s) smaller than 0.001: %f\n", iCoeff, E(0.001) ))
    }
  }
}
```

Only a subset of the results is shown below. First comes the empirical cumulative distribution function of the sorted weights for all combinations of 5 assets. The graph shows that the smallest weight is almost always 0 and that the second smallest is also very small most of the time. The textual output shows that the second smallest weight is below 0.001 in 91% of the combinations; equivalently, 91% of the best portfolios use at most 3 of the 5 possible assets.
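To show how these fractions are read off the ECDF, here is a small sketch on made-up sorted weights. The matrix `w` is hypothetical; it only mimics the layout of the results, with one column per combination and rows sorted increasingly.

```r
# Hypothetical sorted weights for 4 combinations of 3 assets:
# row 1 holds the smallest weight of each combination, row 3 the largest.
w <- cbind(c(0,      0,   1),
           c(0,      0.3, 0.7),
           c(0.0005, 0.2, 0.7995),
           c(0.1,    0.3, 0.6))
threshold <- 0.001

# ecdf(v)(t) is the fraction of values in v that are <= t, so entry k is the
# share of combinations whose k-th smallest weight is negligible.
fracBelow <- sapply(seq_len(nrow(w)), function(k) ecdf(w[k, ])(threshold))
print(fracBelow)

# The same numbers without going through ecdf.
print(rowMeans(w <= threshold))
```

If the k-th smallest of m weights is negligible, so are the k - 1 smaller ones, which means that combination uses at most m - k assets.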

Combinations of 5 assets

Percent of portfolio with 1 weight(s) smaller than 0.001: 0.996499

Percent of portfolio with 2 weight(s) smaller than 0.001: 0.914086

Percent of portfolio with 3 weight(s) smaller than 0.001: 0.499448

Percent of portfolio with 4 weight(s) smaller than 0.001: 0.067975

The density of the smallest weight in the two-asset case clearly shows a high peak at zero.

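Finally, the textual output and the ECDF show that out of 33 possible assets, at most 7 are ever present in the BCRP. In this output the number is the rank of the weight in increasing order, so 33 is the largest weight and 27 the seventh largest.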

Combinations of 33 assets

Percent of portfolio with 33 weight(s) smaller than 0.001: 0.000000

Percent of portfolio with 32 weight(s) smaller than 0.001: 0.000000

Percent of portfolio with 31 weight(s) smaller than 0.001: 0.000000

Percent of portfolio with 30 weight(s) smaller than 0.001: 0.000000

Percent of portfolio with 29 weight(s) smaller than 0.001: 0.065126

Percent of portfolio with 28 weight(s) smaller than 0.001: 0.707843

Percent of portfolio with 27 weight(s) smaller than 0.001: 0.947059

All this points to the fact that most of the weights of the BCRP end up on the boundary of the simplex, and that removing that specific constraint would give an even better solution, at least in terms of terminal wealth. We'll investigate this further in future posts.
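A two-asset toy case gives a first impression of the effect. This is only a sketch under stated assumptions: the price relatives are made up, the optimization uses base R's `optimize` rather than logopt, and the search interval for the relaxed problem is chosen by hand so that the wealth factor stays positive in every period.

```r
# Hypothetical price relatives (ratio of consecutive prices) for two assets.
x1 <- c(1.10, 0.98, 1.08, 1.02)
x2 <- c(1.00, 1.01, 0.99, 1.00)

# Log of the terminal wealth of the CRP holding fraction b in asset 1
# and 1 - b in asset 2, rebalanced every period.
logWealth <- function(b) sum(log(b * x1 + (1 - b) * x2))

# Constrained problem: b in [0, 1], i.e. the two-asset simplex.
onSimplex <- optimize(logWealth, interval = c(0, 1), maximum = TRUE)

# Relaxed problem: b may leave [0, 1] (shorting one asset) as long as every
# per-period wealth factor stays positive; interval picked by hand for this data.
relaxed <- optimize(logWealth, interval = c(-9.9, 33.6), maximum = TRUE)

cat(sprintf("on the simplex: b = %.3f, terminal wealth = %.4f\n",
            onSimplex$maximum, exp(onSimplex$objective)))
cat(sprintf("relaxed:        b = %.3f, terminal wealth = %.4f\n",
            relaxed$maximum, exp(relaxed$objective)))
```

On this data the constrained optimum sits at the edge of the interval, and the relaxed optimum is far outside it with a higher terminal wealth. The leverage it implies is extreme and only meaningful in hindsight on these four periods.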
