EVP is still the term you use when discussing what you find as 'evidence' from the Frank's Box or Shack Hack. Electronic Voice Phenomena are what some believe to be conscious energy MANIPULATING energy from the electromagnetic spectrum to communicate with us (the living).
Researchers do not rule out that the voices are radio stations; in fact, you try to measure the sweeping to see what the possible sentence link would be. You also try to rationalize with logic how intelligent and/or responsive the results are to your questions, to rule out chance. No radio is designed to speak with the dead, but the hacks are designed in order to try. With sweeping, removing the mute pin and enhancing the white noise, you are trying to provide opportunity.
Why use this form? Because intent has a lot to do with intelligent communication. When researchers arrive, announce their presence and bring a tool that provides opportunity, well... logic tells us that you use the easiest means available. Conscious energy, or what I call 'ghosts', does the same.
If one considers this a viable method, then it is also necessary to understand the exact method by which such "white noise" and portions of sentences are manipulated. That includes the precise voltage required to select the desired station these boxes tune to, at the precise instant needed to assemble the desired phrase.
Digital tuning is done by placing a DC voltage on the tuning input of a chip, which sets the frequency of the local oscillator. In AM this frequency is heterodyned, or beat, against the frequency of the desired station. A difference frequency, the intermediate frequency (IF), is derived, amplified and detected. In a nutshell, this is how modern radio works.
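As a rough illustration of the heterodyne arithmetic described above, the sketch below works out the local oscillator frequency a receiver would need for a given AM station. The 455 kHz intermediate frequency is a common value in AM broadcast receivers but is an assumption here, as are the function names and the choice of putting the oscillator above the station frequency.

```python
# Minimal sketch of superheterodyne tuning arithmetic.
# Assumptions: 455 kHz IF and an oscillator running above the station frequency.
IF_KHZ = 455.0


def local_oscillator_khz(station_khz, if_khz=IF_KHZ):
    """Local oscillator frequency needed to receive station_khz."""
    return station_khz + if_khz


def intermediate_khz(lo_khz, station_khz):
    """Difference (beat) frequency produced by mixing oscillator and station."""
    return abs(lo_khz - station_khz)


if __name__ == "__main__":
    # Example stations across the AM broadcast band (illustrative values).
    for station in (540.0, 1000.0, 1600.0):
        lo = local_oscillator_khz(station)
        beat = intermediate_khz(lo, station)
        print(f"station {station:.0f} kHz: oscillator {lo:.0f} kHz, IF {beat:.0f} kHz")
```

Each station needs its own oscillator frequency, and therefore its own tuning voltage, which is the quantity the argument below turns on.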
Now, if the spirit is to cause reception of a particular station, it must be able to place on the tuning pin the precise DC voltage needed to get the desired station. Plus it must do so without error and only for the time period required to pick up the phoneme or phrase it wants, then immediately switch to the next station by seamlessly changing the DC voltage on the tuning input. Since most phonemes of speech last from 50 to 200 msec, this is the rate at which the spirit must select phonemes. If one can explain how the spirit is able to control the tuning voltage with this level of precision, then an argument might be made for a ghost box.
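To put numbers on that timing requirement, here is a small sketch that converts the 50 to 200 msec phoneme duration quoted above into a station-selection rate. The 12-phoneme phrase is purely a hypothetical example, and the function names are mine.

```python
# Minimal sketch: how fast station selections must happen if each phoneme
# is picked from a broadcast. Phoneme durations (50 to 200 msec) come from
# the text; the phrase length below is a hypothetical example.


def selections_per_second(phoneme_ms):
    """Number of phoneme selections needed per second of speech."""
    return 1000.0 / phoneme_ms


def phrase_duration_ms(n_phonemes, phoneme_ms):
    """Total time for a phrase if every phoneme has the same duration."""
    return n_phonemes * phoneme_ms


if __name__ == "__main__":
    n_phonemes = 12  # hypothetical short phrase
    for ms in (50, 200):
        rate = selections_per_second(ms)
        total = phrase_duration_ms(n_phonemes, ms)
        print(f"{ms} msec phonemes: {rate:.0f} selections/sec, "
              f"{n_phonemes} phonemes take {total} msec")
```

So the tuning voltage would have to be set correctly somewhere between 5 and 20 times every second, with no misses, for the whole length of the phrase.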
But there is one more vital factor to consider, which follows from the above paragraph. Even if the spirit were able to achieve this level of control, it still has to know WHICH station to select. That means it has to listen to all available stations AND know what the announcer is going to say before he says it! Otherwise, how can the spirit choose a station?
Like most ITC methods, this is fraught with problems. But you are correct in one respect. If a spirit wishes to communicate, it will use the easiest method available. That would be simple physical manipulation of something present, or possibly causing an electrical disturbance which can be directly picked up and amplified. After all, if this hypothetical spirit can manipulate a DC control pin on a chip, it must be able to generate an electrical signal. So rather than add a bunch of garbage in the form of white noise or other interference, why not use a good high-gain amplifier and a quiet environment, so that the spirit need not generate such a high-level signal to get over our noise? Everyone knows we can hear better in a quiet room than in a noisy environment, so why not make it easy for the spirit to be heard?
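The quiet-room point can be put in signal-to-noise terms. The sketch below uses the standard definition of SNR in decibels, 10 times the base-10 logarithm of signal power over noise power, and assumes the added white noise is uncorrelated with the ambient noise so the two powers simply add. All power values are made-up illustrative numbers, not measurements.

```python
import math

# Minimal sketch: effect of deliberately added white noise on SNR.
# Assumption: ambient and added noise are uncorrelated, so their powers add.


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels."""
    return 10.0 * math.log10(signal_power / noise_power)


def snr_with_added_noise_db(signal_power, ambient_noise, added_noise):
    """SNR after extra noise power is injected on top of the ambient noise."""
    return snr_db(signal_power, ambient_noise + added_noise)


if __name__ == "__main__":
    signal = 1.0    # hypothetical signal power (arbitrary units)
    ambient = 0.01  # hypothetical quiet-room noise power
    added = 0.99    # hypothetical injected white noise power
    print(f"quiet room:       {snr_db(signal, ambient):.1f} dB")
    print(f"with white noise: {snr_with_added_noise_db(signal, ambient, added):.1f} dB")
```

With these numbers the same signal goes from well above the noise to level with it, which is the reason for preferring a quiet environment and a clean amplifier.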