Skip to content

Stat collector solidfire capacity fix#4918

Merged
yadvr merged 5 commits into
apache:4.15from
shapeblue:stat-collector-solidfire-capacity-fix
Apr 21, 2021
Merged

Stat collector solidfire capacity fix#4918
yadvr merged 5 commits into
apache:4.15from
shapeblue:stat-collector-solidfire-capacity-fix

Conversation

@yadvr

@yadvr yadvr commented Apr 13, 2021

Copy link
Copy Markdown
Member
Fixes regression introduced in 71c5dbcf492a023dbea5f8c34f8fd883c3ad653f
which would cause capacity bytes of certain pools to be update which
shouldn't get updated by StatsCollector such as solidfire.

Fixes #4911

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Temporary repo builds here for testing purposes for community: http://download.cloudstack.org/testing/pr-4918/
(This will be removed after the PR is merged)

yadvr added 2 commits April 13, 2021 17:32
Fixes regression introduced in 71c5dbc
which would cause capacity bytes of certain pools to be update which
shouldn't get updated by StatsCollector such as solidfire.

Fixes apache#4911

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
…fault

Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
@yadvr yadvr requested a review from DaanHoogland April 13, 2021 12:09
@yadvr yadvr added this to the 4.15.1.0 milestone Apr 13, 2021
@yadvr

yadvr commented Apr 13, 2021

Copy link
Copy Markdown
Member Author

cc @ravening - pl review the regression was introduced in 71c5dbc
With the fix, anyone upgrading to 4.15 will be unable to use solidfire storage where the capacity bytes is 0.

@yadvr

yadvr commented Apr 13, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Copy link
Copy Markdown

Packaging result: ✔️ centos7 ✔️ centos8 ✔️ debian. SL-JID 405

@yadvr

yadvr commented Apr 13, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan test

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@yadvr

yadvr commented Apr 13, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan test centos7 vmware-67u3

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + vmware-67u3) has been kicked to run smoke tests

@blueorangutan

Copy link
Copy Markdown

Trillian Build Failed (tid-432)

@blueorangutan

Copy link
Copy Markdown

Trillian test result (tid-433)
Environment: vmware-67u3 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35021 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4918-t433-vmware-67u3.zip
Intermittent failure detected: /marvin/tests/smoke/test_diagnostics.py
Smoke tests completed. 86 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_10_traceroute_in_vr Failure 61.32 test_diagnostics.py

Comment thread server/src/main/java/com/cloud/server/StatsCollector.java Outdated
Comment thread server/src/main/java/com/cloud/server/StatsCollector.java
Signed-off-by: Rohit Yadav <rohit.yadav@shapeblue.com>
@yadvr yadvr requested a review from sureshanaparti April 15, 2021 08:47
s_logger.warn("Received 0 capacity for pool ID " + poolId);
}
}
if (pool.getUsedBytes() != ((StorageStats)answer).getByteUsed() && pool.getStorageProviderName().equalsIgnoreCase(DataStoreProvider.DEFAULT_PRIMARY) && !pool.isManaged()) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshanaparti I've added a check to not update when pool is managed, cc @skattoju4 @mike-tutkowski

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rhtyd I think managed pool(s) doesn't use DefaultPrimary as its provider, in that case managed pool check is not required.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point about managed-storage providers not using DefaultPrimary. Each of these storage providers probably provides its own plug-in.

pool.setUsedBytes(((StorageStats) answer).getByteUsed());
pool.setUpdateTime(new Date());
if (_storagePoolStats.get(poolId) != null && _storagePoolStats.get(poolId).getCapacityBytes() != ((StorageStats)answer).getCapacityBytes()) {
if (((StorageStats)answer).getCapacityBytes() > 0) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sureshanaparti cc @DaanHoogland added check to log warning when capacity is reported 0; it's updated only when the value is > 0.

@yadvr

yadvr commented Apr 15, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan package

1 similar comment
@yadvr

yadvr commented Apr 15, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Copy link
Copy Markdown

Packaging result: ✔️ centos7 ✔️ centos8 ✔️ debian. SL-JID 421

@yadvr

yadvr commented Apr 15, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan test matrix

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Trillian-Jenkins matrix job (centos7 mgmt + xs71, centos7 mgmt + vmware65, centos7 mgmt + kvmcentos7) has been kicked to run smoke tests

Comment thread server/src/main/java/com/cloud/server/StatsCollector.java Outdated
@blueorangutan

Copy link
Copy Markdown

Trillian test result (tid-451)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33430 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4918-t451-xenserver-71.zip
Intermittent failure detected: /marvin/tests/smoke/test_scale_vm.py
Intermittent failure detected: /marvin/tests/smoke/test_host_maintenance.py
Smoke tests completed. 86 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_01_scale_vm Failure 9.23 test_scale_vm.py

@blueorangutan

Copy link
Copy Markdown

Trillian test result (tid-452)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 36397 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4918-t452-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_kubernetes_clusters.py
Smoke tests completed. 87 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@blueorangutan

Copy link
Copy Markdown

Trillian test result (tid-457)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 33848 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4918-t457-kvm-centos7.zip
Intermittent failure detected: /marvin/tests/smoke/test_ssvm.py
Smoke tests completed. 86 look OK, 1 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File
test_03_ssvm_internals Failure 3.12 test_ssvm.py

Co-authored-by: Nicolas Vazquez <nicovazquez90@gmail.com>
@yadvr yadvr requested a review from nvazquez April 16, 2021 06:27
s_logger.warn("Not setting capacity bytes, received " + ((StorageStats)answer).getCapacityBytes() + " capacity for pool ID " + poolId);
}
}
if (pool.getUsedBytes() != ((StorageStats)answer).getByteUsed() && pool.getStorageProviderName().equalsIgnoreCase(DataStoreProvider.DEFAULT_PRIMARY)) {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @sureshanaparti @mike-tutkowski - I've addressed the comment, removed the check for managed storage.

@yadvr

yadvr commented Apr 16, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan package

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Copy link
Copy Markdown

Packaging result: ✔️ centos7 ✔️ centos8 ✔️ debian. SL-JID 423

@yadvr

yadvr commented Apr 16, 2021

Copy link
Copy Markdown
Member Author

@blueorangutan test

@blueorangutan

Copy link
Copy Markdown

@rhtyd a Trillian-Jenkins test job (centos7 mgmt + kvm-centos7) has been kicked to run smoke tests

@blueorangutan

Copy link
Copy Markdown

Trillian test result (tid-465)
Environment: kvm-centos7 (x2), Advanced Networking with Mgmt server 7
Total time taken: 32515 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4918-t465-kvm-centos7.zip
Smoke tests completed. 87 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@blueorangutan

Copy link
Copy Markdown

Trillian test result (tid-468)
Environment: vmware-65u2 (x2), Advanced Networking with Mgmt server 7
Total time taken: 35088 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr4918-t468-vmware-65u2.zip
Smoke tests completed. 87 look OK, 0 have error(s)
Only failed tests results shown below:

Test Result Time (s) Test File

@mike-tutkowski mike-tutkowski left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yadvr yadvr merged commit 5051fde into apache:4.15 Apr 21, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants