Skip to content

Init issue#230

Merged
JeanKossaifi merged 1 commit into
tensorly:mainfrom
caglayantuna:init_issue
Mar 23, 2021
Merged

Init issue#230
JeanKossaifi merged 1 commit into
tensorly:mainfrom
caglayantuna:init_issue

Conversation

@caglayantuna

Copy link
Copy Markdown
Member

his pull request is related to issue #222. We suggest some updates in _cp.py file.

init docstring

Since there are 3 options to initialize tensors but only 2 of them are explained in the docstring, We added "If init is a previously initialized cp tensor, all the weights are pulled in the last factor and then the weights are set to "1" for the output tensor." and 'cp tensor' to 'init : {'svd', 'random', cptensor}, optional'. We have also added uniform distribution information to random option.

Reverse normalization for weights

We added some lines to third option in order to avoid weight conflicts:

            kt = CPTensor(init)
            weights, factors = kt
            
            if tl.all(weights == 1):
                kt = CPTensor((None, factors))
            else:
                for i in range(len(factors)):
                    factors[-1] = factors[-1]*weights[i]
                kt = CPTensor((None, factors))

Thus, if a user puts an input weights with values different than "1", these lines will rescale the last factor accordingly.

Warning for normalize_factors

We have also added a warning as if normalize_factors is True when using an initial tensor. Because it is not useful to normalize after pulling all weights to the last factor.

Mttkrp calculation

Besides, we changed the mttkrp lines which were taking into account weights before. Now, one option is enough to calculate mttkrp.

mttkrp = unfolding_dot_khatri_rao(tensor, (None, factors), mode)

@codecov

codecov Bot commented Mar 1, 2021

Copy link
Copy Markdown

Codecov Report

Merging #230 (65a9c62) into main (06393b2) will decrease coverage by 0.02%.
The diff coverage is 54.54%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #230      +/-   ##
==========================================
- Coverage   86.99%   86.97%   -0.03%     
==========================================
  Files          89       89              
  Lines        4429     4436       +7     
==========================================
+ Hits         3853     3858       +5     
- Misses        576      578       +2     
Impacted Files Coverage Δ
tensorly/decomposition/_cp.py 77.81% <54.54%> (-0.18%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 06393b2...65a9c62. Read the comment docs.

@JeanKossaifi JeanKossaifi left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @caglayantuna, left comments directly in the code.


The type of initialization is set using `init`. If `init == 'random'` then
initialize factor matrices using `random_state`. If `init == 'svd'` then
initialize factor matrices with uniform distribution using `random_state`. If `init == 'svd'` then

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm actually thinking of changing the init so the reconstructed tensor has 0-mean and a variance specified by users, while the factors will have a Gaussian distribution.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure I think this makes more sense for unconstrained Parafac. It's not hard to compute the variance of the tensor given the variance of each iid factor entry I can help with that if you want.

However this is not great for nonnegative CP and others, for which keeping the uniform distribution may be better?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I already made the calculations for the variance and implemented them for the torch version, I just need to port it for all backend, will try to do when I get some time.

That's a good point for the CP-NN, do you think uniform is better than a folder gaussian (abs(gaussian)) for initialization? Gaussian distribution might be more predictable when the order of the tensor grows?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even after working man years with nonnegativity, I don't have a strong idea on that. Maybe it's good to leave the choice open to the user.

Comment thread tensorly/decomposition/_cp.py Outdated
try:
try:
if normalize_factors is True:
warnings.warn('It is not recommended to initialize an initial tensor with normalizing. Consider normalizing the initial tensor before using this function')

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can probably clarify the warning -- the normalization is on the factors, by initial tensor, users might understand the input tensor.

More generally, why wouldn't we want users to request a normalized tensor while also providing their own init?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will clarify the warning indeed.

The behavior we wanted to avoid was

  1. the user inputs a tensor with non-unit weights
  2. the weights are pulled on the last factor
  3. the factors are normalized
    In that case I was worried that the weights pulling was useless and that could confuse the user, therefore suggesting to normalize beforehand. Indeed a user might want to normalize his input initialization, but then is it not a better practice to do it outside the init function, so that the init function does not change the init tensor itself? To me the normalization option should only be allowed for random or svd init.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's good to warn the user that their normalization will be overwritten. But that will anyway right? I don't think the user expects the initialization to not change? I tend to just view (weights, factors) as a generic representation in CP form rather than a normalized CP. They may have gotten the (weights, factors) from e.g. a previous run of CP with normalize=True and just want to do warm restart.

I'm actually curious about the actual impact of normalization on stability within CP.

Comment thread tensorly/decomposition/_cp.py Outdated
kt = CPTensor((None, factors))
else:
for i in range(len(factors)):
factors[-1] = factors[-1]*weights[i]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why that factor rather than any other? At first glance, I don't have a strong intuition about that - would it affect convergence/stability? Another thing I wanted to add was to actually spread the weights in all the factors (more generally, to manipulate decomposed factors without changing reconstruction).

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, there is no specific reason for that. Scaling only one factor doesn't effect the reconstruction. If you prefer, I would change to scale all the factors with related weights.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could even be a separate function in that case, e.g. to make all factors equal weights? Do you think it would make sense to check the numerical stability when i) absorbing all weights in one factors and ii) spreading them in all factors?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As we agreed last week, we updated factor initialization by spreading weights in all factors. In order to avoid unstable weight values, we update all the factors with geometric mean of weight values. I hope current version is ok for merging

@caglayantuna caglayantuna force-pushed the init_issue branch 2 times, most recently from d3652ef to 4592da2 Compare March 16, 2021 14:16
@JeanKossaifi

Copy link
Copy Markdown
Member

Thanks @caglayantuna, looks good to me.
Could you do a little rebase to have a clean history?

@caglayantuna

Copy link
Copy Markdown
Member Author

Hey @JeanKossaifi , thanks. I squashed all the commits into one commit. But, I couldn't understand the problem with MxNet.

@JeanKossaifi

Copy link
Copy Markdown
Member

Thanks @caglayantuna, merging!
The MXNet issue seems to be due to some missing dependencies, will try to fix it when I get a moment.

@JeanKossaifi JeanKossaifi merged commit 7c9d410 into tensorly:main Mar 23, 2021
@caglayantuna caglayantuna deleted the init_issue branch March 29, 2021 09:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants