Existing deep learning systems in Internet of Things (IoT) environments cannot allocate computing tasks sensibly, which wastes resources. In this letter, we propose AAIoT, a method that allocates the inference computation of each network layer to the devices of a multi-layer IoT system. To the best of our knowledge, this is the first attempt to solve this problem. We design a dynamic programming algorithm that minimizes response time by weighing the cost of computation against the cost of transmission. Simulation results show that our approach significantly improves system response time.
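To make the computation/transmission trade-off concrete, the following Python sketch shows one way such a dynamic program could look. It is an illustration under stated assumptions, not the AAIoT algorithm itself: it assumes the devices form a chain (device 0 holds the raw input, and data only moves toward higher-indexed devices), that each layer runs entirely on one device, and that the result does not need to be sent back. The names `allocate_layers`, `comp`, `data`, and `link` are hypothetical.

```python
from itertools import accumulate


def allocate_layers(comp, data, link):
    """Assign network layers to a chain of devices (illustrative sketch).

    comp[j][i]: assumed time for device j to compute layer i
    data[i]:    assumed size of the input to layer i (data[0] is the raw input)
    link[j]:    assumed time per unit of data sent from device j to device j + 1

    Returns (min_time, plan), where plan[i] is the device that runs layer i.
    """
    n_dev, n_layers = len(comp), len(data)
    # hops[j] = time per unit of data to travel from device 0 to device j
    hops = [0.0] + list(accumulate(link))

    inf = float("inf")
    # best[i][j] = minimum time to finish layers 0..i with layer i on device j
    best = [[inf] * n_dev for _ in range(n_layers)]
    # prev[i][j] = device that ran layer i - 1 in that optimal schedule
    prev = [[0] * n_dev for _ in range(n_layers)]

    # Layer 0: ship the raw input from device 0 to device j, then compute.
    for j in range(n_dev):
        best[0][j] = data[0] * hops[j] + comp[j][0]

    for i in range(1, n_layers):
        for j in range(n_dev):
            # Layer i - 1 ran on some device k <= j (data only moves forward).
            for k in range(j + 1):
                t = best[i - 1][k] + data[i] * (hops[j] - hops[k]) + comp[j][i]
                if t < best[i][j]:
                    best[i][j] = t
                    prev[i][j] = k

    # Pick the best device for the last layer, then walk the choices back.
    last = min(range(n_dev), key=lambda j: best[-1][j])
    plan = [last]
    for i in range(n_layers - 1, 0, -1):
        plan.append(prev[i][plan[-1]])
    plan.reverse()
    return best[-1][last], plan
```

For example, with a slow end device and a fast upstream device, `allocate_layers([[4, 4, 4], [1, 1, 1]], [10, 2, 1], [1])` keeps the first layer on device 0, where the input is large, and moves the remaining layers to device 1 once the intermediate data has shrunk. The triple loop costs O(N·M²) for N layers and M devices, which is acceptable for the small device counts typical of such hierarchies.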