When an element has VoiceOver focus, double tapping anywhere on the screen activates it as if you had tapped the centre of that element. If you need the tap to land somewhere else, you can customise the element's accessibilityActivationPoint, which is expressed in screen coordinates.
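
As a minimal sketch of the idea, assume a hypothetical `SettingCell` that exposes a label and a switch as a single element. Its centre would fall on empty space between the two, so the activation point is moved onto the switch. The class and property names are made up for illustration.

```swift
import UIKit

// Hypothetical cell: a title and a switch, exposed to VoiceOver as one element.
final class SettingCell: UITableViewCell {
    let titleLabel = UILabel()
    let toggle = UISwitch()

    override init(style: UITableViewCell.CellStyle, reuseIdentifier: String?) {
        super.init(style: style, reuseIdentifier: reuseIdentifier)
        contentView.addSubview(titleLabel)
        contentView.addSubview(toggle)
        isAccessibilityElement = true
    }

    required init?(coder: NSCoder) {
        fatalError("init(coder:) is not supported in this sketch")
    }

    override var accessibilityActivationPoint: CGPoint {
        get {
            // The activation point is in screen coordinates, so convert the
            // switch's bounds before taking its centre.
            let frame = UIAccessibility.convertToScreenCoordinates(toggle.bounds, in: toggle)
            return CGPoint(x: frame.midX, y: frame.midY)
        }
        set { super.accessibilityActivationPoint = newValue }
    }
}
```

With this in place, a VoiceOver double tap on the cell should behave like a tap on the switch rather than on the middle of the row.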


Grouping elements where it makes sense can make navigation much easier with assistive technologies such as VoiceOver, Switch Control, or Full Keyboard Access. It also helps reduce redundancy.
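
One common way to group in UIKit is to make the container itself the accessibility element and combine the labels of its children. The sketch below assumes a hypothetical `ProductCardView` with two labels; the names are illustrative only.

```swift
import UIKit

// Hypothetical card: name and price are read as one stop instead of two.
final class ProductCardView: UIView {
    let nameLabel = UILabel()
    let priceLabel = UILabel()

    override init(frame: CGRect) {
        super.init(frame: frame)
        addSubview(nameLabel)
        addSubview(priceLabel)
        // Making the container an accessibility element hides its subviews
        // from assistive technologies, so the card becomes a single stop.
        isAccessibilityElement = true
    }

    required init?(coder: NSCoder) {
        fatalError("init(coder:) is not supported in this sketch")
    }

    override var accessibilityLabel: String? {
        get {
            // Combine the visible texts, skipping any label that is empty.
            [nameLabel.text, priceLabel.text]
                .compactMap { $0 }
                .joined(separator: ", ")
        }
        set { super.accessibilityLabel = newValue }
    }
}
```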

Apple's guidelines for accessibility hints: begin with a verb that explains the result of the action; avoid the imperative form of the verb, because it can sound like a command; don't include the action type; don't include the name of the control. https://developer.apple.com/documentation/objectivec/nsobject-swift.class/accessibilityhint
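
To illustrate how those guidelines might apply, here is a small sketch with a hypothetical play button. The strings are examples written for this tip, not quotations.

```swift
import UIKit

// Hypothetical factory for a play button with a label and a hint.
@MainActor
func makePlayButton() -> UIButton {
    let button = UIButton(type: .system)
    button.setTitle("Play", for: .normal)
    button.accessibilityLabel = "Play"

    // Starts with a verb describing the result, not a command.
    button.accessibilityHint = "Plays the song."

    // Hints to avoid, following the guidelines above:
    //   "Play the song."              (imperative, sounds like a command)
    //   "Double tap to play the song." (includes the action type)
    //   "Button that plays the song." (includes the control)
    return button
}
```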

Too much information can overwhelm users; too little makes for an incomplete experience. It is hard to strike the right balance on verbosity, and users may have different preferences. To help with this, the AXCustomContent APIs let you mark data as optional, so that users can ask for it only when they want it.
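
A minimal sketch of how this could look, assuming a hypothetical `FlightCell` on iOS 14 or later. Content with the default importance is offered on demand, while content marked `.high` is read along with the element. The cell, its properties, and the sample values are invented for illustration.

```swift
import UIKit
import Accessibility

// Hypothetical cell that splits its details into essential and optional data.
final class FlightCell: UITableViewCell, AXCustomContentProvider {
    var gate = "B12"
    var aircraft = "Airbus A320"

    var accessibilityCustomContent: [AXCustomContent]! {
        get {
            // Essential: read together with the element.
            let gateContent = AXCustomContent(label: "Gate", value: gate)
            gateContent.importance = .high

            // Optional: default importance, available on demand.
            let aircraftContent = AXCustomContent(label: "Aircraft", value: aircraft)

            return [gateContent, aircraftContent]
        }
        set { }
    }
}
```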